Abstract. The problem of recovering a matrix of low rank from an incomplete and possibly noisy set of
linear measurements arises in a number of areas such as quantum state tomography, machine learning and 
the PhaseLift approach to phaseless reconstruction problems. In order to derive rigorous recovery results, the 
measurement map is usually modeled probabilistically and convex optimization approaches including nuclear 
norm minimization are often used as recovery methods. In this article, we derive sufficient conditions on the
minimal amount of measurements that ensure recovery via convex optimization. We establish our results 
via certain properties of the null space of the measurement map. In the setting where the measurements are 
realized as Frobenius inner products with independent standard Gaussian random matrices, we show that
$m > 10 r (n_1 + n_2)$ measurements are enough to uniformly and stably recover an $n_1 \times n_2$ matrix of rank
at most r. Stability is meant both with respect to passing from exactly rank-r matrices to approximately 
rank-r matrices and with respect to adding noise on the measurements. We then significantly generalize this 
result by only requiring independent mean-zero, variance one entries with four finite moments at the cost of 
replacing 10 by some universal constant. We also study the particular case of recovering Hermitian rank-r 
matrices from measurement matrices proportional to rank-one projectors. For r = 1, such a problem reduces 
to the PhaseLift approach to phaseless recovery, while the case of higher rank is relevant for quantum state 
tomography. For $m \geq C r n$ rank-one projective measurements onto independent standard Gaussian vectors,
we show that nuclear norm minimization uniformly and stably reconstructs Hermitian rank-r matrices with 
high probability. Subsequently, we partially de-randomize this result by establishing an analogous statement 
for projectors onto independent elements of a complex projective 4-design at the cost of a slightly higher
sampling rate $m \geq C r n \log n$. Complex projective $t$-designs are discrete sets of vectors whose uniform
distribution reproduces the first t moments of the uniform distribution on the sphere. Moreover, if the 
Hermitian matrix to be recovered is known to be positive semidefinite, then we show that the nuclear norm 
minimization approach may be replaced by the simpler optimization program of minimizing the $\ell_2$-norm
of the residual subject to the positive semidefinite constraint. This has the additional advantage that no 
estimate of the noise level is required a priori. We discuss applications of such a result in quantum physics 
and the phase retrieval problem. Apart from the case of independent Gaussian measurements, the analysis 
exploits Mendelson’s small ball method. 

Keywords. low rank matrix recovery, quantum state tomography, phase retrieval, convex optimization,
nuclear norm minimization, positive semidefinite least squares problem, complex projective designs, random
measurements

1. Introduction

In recent years, the recovery of objects (signals, images, matrices, quantum states etc.) from incomplete 
linear measurements has gained significant interest. While standard compressive sensing considers the 
reconstruction of (approximately) sparse vectors [26], we study extensions to the recovery of (approximately) 
low rank matrices from a small number of random measurements. This problem arises in a number of areas 
such as quantum tomography [30, 24, 6], signal processing [2], recommender systems [16, 11] and phaseless
recovery [12, 10, 28, 29]. We consider, on the one hand, random measurement maps generated by
independent random matrices with independent entries and, on the other hand, measurements with respect
to independent rank-one matrices. We derive bounds for the number of required measurements in
terms of the matrix dimensions and the rank of the matrix that guarantee successful recovery via nuclear
norm minimization. Our results are uniform and stable with respect to noise on the measurements and with 
respect to passing to approximately rank-r matrices. For rank-one measurements the latter stability result 
is new. 

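Let us formally describe our setup. We consider measurements of an (approximately) low-rank matrix
$X \in \mathbb{C}^{n_1 \times n_2}$ of the form $b = \mathcal{A}(X)$, where the linear measurement map $\mathcal{A}$ is given as

$$\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m, \quad Z \mapsto \sum_{j=1}^m \operatorname{tr}(Z A_j^*) e_j. \tag{1}$$

Here, $e_1, \ldots, e_m$ denote the standard basis vectors in $\mathbb{C}^m$ and $A_1, \ldots, A_m \in \mathbb{C}^{n_1 \times n_2}$ are called measurement
matrices. A prominent approach [22, 56] for recovering the matrix $X$ from $b = \mathcal{A}(X)$ consists in computing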
the minimizer of the convex optimization problem 

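$$\min_{Z \in \mathbb{C}^{n_1 \times n_2}} \|Z\|_* \quad \text{subject to} \quad \mathcal{A}(Z) = b, \tag{2}$$

where $\|Z\|_* = \|Z\|_1 = \sum_{j=1}^n \sigma_j(Z)$ denotes the nuclear norm with $\sigma_j(Z)$ being the singular values of
$Z \in \mathbb{C}^{n_1 \times n_2}$ and $n = \min\{n_1, n_2\}$. Efficient optimization methods exist for this problem [55, 8]. In practice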
the measurements are often perturbed by noise, i.e., 

$$b = \mathcal{A}(X) + w, \tag{3}$$

where $w \in \mathbb{C}^m$ is a vector of perturbations. In this case, we replace (2) by the noise constrained nuclear
norm minimization problem

$$\min_{Z \in \mathbb{C}^{n_1 \times n_2}} \|Z\|_* \quad \text{subject to} \quad \|\mathcal{A}(Z) - b\|_{\ell_2} \leq \eta, \tag{4}$$

where $\eta$ corresponds to a known estimate of the noise level, i.e., $\|w\|_{\ell_2} \leq \eta$ with $\|x\|_{\ell_p} = (\sum_j |x_j|^p)^{1/p}$ being
the usual $\ell_p$-norm. In some cases it is known a priori that the matrix $X$ of interest is both Hermitian and
positive semidefinite ($X \succeq 0$). Then one may replace (4) by the optimization problem

$$\min_{Z \succeq 0} \operatorname{tr}(Z) \quad \text{subject to} \quad \|\mathcal{A}(Z) - b\|_{\ell_2} \leq \eta. \tag{5}$$

However, as we will see, the simpler least squares problem 

$$\min_{Z \succeq 0} \|\mathcal{A}(Z) - b\|_{\ell_2} \tag{6}$$

works equally well or even better in terms of recovery under certain natural conditions. Apart from simplicity
and computational efficiency it has the additional advantage that no estimate $\eta$ of the noise level is required.
We note that other efficient recovery methods exist as well [46, 25, 64], but we will not go into details here. 
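
The measurement model (1) with additive noise (3) can be sketched numerically. The following is a minimal illustration and not the authors' code: it assumes real standard Gaussian measurement matrices stored as nested lists, and the helper names `frobenius_inner`, `measure` and `gaussian_matrix` are hypothetical.

```python
import random


def frobenius_inner(A, Z):
    """Return tr(Z A^*) = sum over entries of Z[k][l] * conj(A[k][l])."""
    return sum(z * a.conjugate()
               for row_a, row_z in zip(A, Z)
               for a, z in zip(row_a, row_z))


def measure(mats, Z):
    """Apply the linear map (1): Z -> (tr(Z A_1^*), ..., tr(Z A_m^*))."""
    return [frobenius_inner(A, Z) for A in mats]


def gaussian_matrix(n1, n2, rng):
    """n1 x n2 matrix with independent standard Gaussian (real) entries."""
    return [[complex(rng.gauss(0.0, 1.0)) for _ in range(n2)] for _ in range(n1)]


rng = random.Random(0)
n1, n2, m = 3, 4, 5
mats = [gaussian_matrix(n1, n2, rng) for _ in range(m)]

# A rank-one test matrix X = u v^T (illustrative values).
u = [1.0, -2.0, 0.5]
v = [1.0, 0.0, 2.0, -1.0]
X = [[complex(ui * vj) for vj in v] for ui in u]

# Noisy measurements b = A(X) + w as in (3), with small Gaussian noise w.
w = [complex(rng.gauss(0.0, 0.01)) for _ in range(m)]
b = [y + wj for y, wj in zip(measure(mats, X), w)]
```

For a rank-one matrix $X = uv^T$ and a real measurement matrix $A_j$, the $j$-th noiseless measurement equals the bilinear form $u^T A_j v$, which gives a quick sanity check of the implementation.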

A question of central interest concerns the minimal number m of required measurements that guarantees 
exact (in the noiseless case) or approximate recovery. While it is very hard to study this question for 
deterministic measurement maps A, several results are available for certain models of random maps. We 
will study several scenarios which all have in common that the matrices $A_1, \ldots, A_m \in \mathbb{C}^{n_1 \times n_2}$ are
independent draws of a random matrix $\Phi = (X_{ij})_{i,j}$. We first consider the real-valued case, where all entries
$X_{ij}$ are independent and then move to a complex-valued scenario where $\Phi = aa^* \in \mathbb{C}^{n \times n}$ is a rank one
matrix generated by a random vector $a \in \mathbb{C}^n$. For the latter scenario we consider $a$ being a complex
Gaussian random vector, or a being randomly drawn from a so-called (approximate) t-design. This last 
setup has implications for quantum tomography and this part of the article can be seen as a continuation of 
the investigations in [43]. Next, we describe the present state of the art of the various setups and present
our results.

1.1. Robust recovery from measurement matrices with independent entries. We call $\mathcal{A}$ a Gaussian
measurement map if the matrices $A_1, \ldots, A_m \in \mathbb{R}^{n_1 \times n_2}$ are independent realizations of Gaussian
random matrices, i.e., all entries of the $A_j$ are independent standard Gaussian random variables. More
generally, $\mathcal{A}$ is called subgaussian, if the entries of all the $A_j$ are independent, mean zero, variance one,
subgaussian random variables, where we recall that a random variable $\xi$ is called subgaussian if
$\mathbb{P}(|\xi| > t) \leq 2 e^{-ct^2}$ for all $t > 0$ and some constant $c > 0$. If

$$m \geq C r (n_1 + n_2) \tag{7}$$

for some universal constant $C > 0$, then with probability at least $1 - e^{-cm}$ any rank $r$ matrix $X \in \mathbb{C}^{n_1 \times n_2}$ is
reconstructed exactly from subgaussian measurements $b = \mathcal{A}(X)$ via nuclear norm minimization (2) [56, 15].
Moreover, if noisy measurements $b = \mathcal{A}(X) + w$ with $\|w\|_{\ell_2} \leq \eta$ of an arbitrary matrix $X \in \mathbb{C}^{n_1 \times n_2}$ are
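taken, then the minimizer $X^\sharp$ of (4) satisfies, again with probability at least $1 - e^{-cm}$,

$$\|X - X^\sharp\|_F \leq \frac{C'}{\sqrt{r}} \inf_{Z : \operatorname{rank}(Z) \leq r} \|X - Z\|_* + \frac{C'' \eta}{\sqrt{m}}, \tag{8}$$

where $\|A\|_F = \sqrt{\operatorname{tr}(A^* A)}$ denotes the Frobenius norm, tr being the trace. Note that

$$\inf_{Z : \operatorname{rank}(Z) \leq r} \|X - Z\|_* = \sum_{j=r+1}^n \sigma_j(X) = \|X_c\|_*,$$

where the singular values $\sigma_j(X)$ are arranged in decreasing order and for $X$ with singular value decomposition
$X = \sum_{j=1}^n \sigma_j(X) u_j v_j^*$ the matrix $X_c = \sum_{j=r+1}^n \sigma_j(X) u_j v_j^*$. The error estimate (8) means that reconstruction is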

robust with respect to noise on the measurements and stable with respect to passing to only approximately 
low rank matrices. These statements are uniform in the sense that they hold for all matrices X simultaneously 
once the matrix A has been drawn. They have been established in [15, 52, 56] via the rank restricted isometry 
property (rank-RIP), see e.g. [26] for the standard RIP and its implications. 

While the RIP is a standard tool by now, recovery of low rank matrices via nuclear norm minimization 
is characterized by the so-called null space property [51, 58, 57, 26, 25], see below for details. By using this 
concept, we are able to significantly relax from subgaussian distributions of the entries to distributions with 
only four finite moments. 


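Theorem 1. Let $\mathcal{A} : \mathbb{R}^{n_1 \times n_2} \to \mathbb{R}^m$, $\mathcal{A}(X) = \sum_{j=1}^m \operatorname{tr}(X A_j) e_j$, where the $A_j$ are independent copies of a
random matrix $\Phi = (X_{ij})_{i,j}$ with independent mean zero entries obeying

$$\mathbb{E} X_{ij}^2 = 1 \quad \text{and} \quad \mathbb{E} X_{ij}^4 \leq C_4 \quad \text{for all } i, j \text{ and some constant } C_4.$$

Fix $1 \leq r \leq \min\{n_1, n_2\}$ and $0 < \rho < 1$ and set

$$m \geq c_1 \rho^{-2} r (n_1 + n_2).$$

Then with probability at least $1 - e^{-c_2 m}$, for any $X \in \mathbb{R}^{n_1 \times n_2}$ the solution $X^\sharp$ of (4) with $b = \mathcal{A}(X) + w$,
$\|w\|_{\ell_2} \leq \eta$, approximates $X$ with error

$$\|X - X^\sharp\|_F \leq \frac{2(1+\rho)^2}{(1-\rho)\sqrt{r}} \|X_c\|_* + \frac{(3+\rho)}{(1-\rho) c_3} \cdot \frac{\eta}{\sqrt{m}}. \tag{9}$$

Here $c_1, c_2, c_3$ are positive constants that only depend on $C_4$.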


In the special case when $\Phi$ has independent standard Gaussian entries, we apply Gordon’s escape through
a mesh theorem [27] in order to obtain an explicit constant in the estimate for the number of measurements,
see Theorem 19. Roughly speaking, with high probability, any $n_1 \times n_2$ matrix of rank $r$ is stably recovered
from $m > 10 r (n_1 + n_2)$ Gaussian measurements. We remark that the explicit bound $m > 3r(n_1 + n_2)$ has
been derived in [18] (see also [49] and [4, Section 4.4] for a phase transition result in this context), but
this bound considers nonuniform recovery, i.e., recovery of a fixed low rank matrix with a random draw of a
Gaussian measurement matrix with high probability. Moreover, no stability under passing to approximately
low rank matrices has been considered there. Our recovery result is therefore stronger than the one in [18],
but requires more measurements.
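
To get a feeling for these sample sizes, the sketch below compares the count $10 r (n_1 + n_2)$ quoted above for uniform, stable recovery and the nonuniform count $3 r (n_1 + n_2)$ from [18] with the ambient dimension $n_1 n_2$. The function names are hypothetical, and the square $1000 \times 1000$ case is an arbitrary choice for illustration.

```python
def uniform_gaussian_count(r, n1, n2):
    """Threshold 10 r (n1 + n2) quoted in the text for uniform, stable recovery."""
    return 10 * r * (n1 + n2)


def nonuniform_gaussian_count(r, n1, n2):
    """Threshold 3 r (n1 + n2) attributed to [18] for recovery of a fixed matrix."""
    return 3 * r * (n1 + n2)


def ambient_dimension(n1, n2):
    """Number of entries of an n1 x n2 matrix."""
    return n1 * n2


n1 = n2 = 1000
for r in (1, 5, 20):
    m_uniform = uniform_gaussian_count(r, n1, n2)
    m_nonuniform = nonuniform_gaussian_count(r, n1, n2)
    fraction = m_uniform / ambient_dimension(n1, n2)
    print(f"r={r}: uniform {m_uniform}, nonuniform {m_nonuniform}, "
          f"uniform count / ambient dimension = {fraction:.2f}")
```

Both counts grow linearly in the rank and in $n_1 + n_2$, so for small $r$ they stay far below the $n_1 n_2$ entries of the matrix.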
1.2. Robust recovery of Hermitian matrices from rank-one projective measurements. Let us
now focus on the particular case of recovering complex Hermitian $n \times n$ matrices from noisy measurements
of the form (3), where the measurement matrices are proportional to rank-one projectors, i.e.,

$$A_j = a_j a_j^* \in \mathcal{H}_n, \tag{10}$$

where $a_j \in \mathbb{C}^n$. Here, $\mathcal{H}_n$ denotes the space of complex Hermitian $n \times n$ matrices, which has real dimension
$n^2$. Measurements of that type occur naturally in convex relaxations of the phase retrieval problem [12, 10,
28, 29]. In fact, suppose phaseless measurements of the form $b_j = |\langle x, a_j \rangle|^2$ of a vector $x \in \mathbb{C}^n$ are given.
Then we can rewrite $b_j = \operatorname{tr}(xx^* a_j a_j^*) = \operatorname{tr}(X A_j)$ as linear measurements of the rank one matrix $X = xx^*$.
We will expand on this aspect below in Section 2.1. Rank one measurements of low rank matrices feature 
prominently in quantum state tomography as well, see also below. 
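
The lifting identity $|\langle x, a \rangle|^2 = \operatorname{tr}(xx^* aa^*)$ used above is easy to check numerically. The following is a small sketch with randomly drawn complex vectors; the helper names `inner`, `outer` and `trace_of_product` are hypothetical and not taken from the paper.

```python
import cmath
import random


def inner(u, v):
    """Inner product <u, v> = sum_k conj(u_k) v_k."""
    return sum(uk.conjugate() * vk for uk, vk in zip(u, v))


def outer(u):
    """Rank-one Hermitian matrix u u^* as a nested list."""
    return [[uk * ul.conjugate() for ul in u] for uk in u]


def trace_of_product(A, B):
    """tr(A B) for square matrices given as nested lists."""
    n = len(A)
    return sum(A[k][l] * B[l][k] for k in range(n) for l in range(n))


rng = random.Random(1)
n = 4
x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

# Phaseless measurement and its lifted, linear counterpart.
phaseless = abs(inner(x, a)) ** 2
lifted = trace_of_product(outer(x), outer(a))

# A global phase on x changes neither xx^* nor the measurement.
phase = cmath.exp(0.7j)
x_rot = [phase * xk for xk in x]
lifted_rot = trace_of_product(outer(x_rot), outer(a))
```

The last three lines illustrate why only $xx^*$, and hence $x$ up to a global phase, can be recovered from such measurements.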

The prior information that the desired matrix is Hermitian limits the search space in the convex
optimization problem (4) and it simplifies to

$$\min_{Z \in \mathcal{H}_n} \|Z\|_* \quad \text{subject to} \quad \|\mathcal{A}(Z) - b\|_{\ell_2} \leq \eta. \tag{11}$$

Arguably, the most generic measurement matrices of the form (10) result from choosing each $a_j$ to be an
independent complex standard Gaussian vector. For the particular case of phase retrieval, i.e., where the
matrix of interest $X = xx^*$ is itself proportional to a rank-one projector, uniform recovery guarantees by
means of (11) have been established for $m = Cn$ independent measurements in [13]. Recently, this result
has been generalized to recovery of any Hermitian rank-$r$ matrix by means of $m = Crn$ such measurements
in [43]. Our refined analysis of the null space property enables us to further strengthen this result by
additionally guaranteeing stability under passing to approximately low rank matrices:


Theorem 2. Consider the measurement process described in (1) with $m$ measurement matrices of the form
(10), where each $a_j$ is an independent complex standard Gaussian vector. Fix $r \leq n$, $0 < \rho < 1$ and suppose
that

$$m \geq C_1 \rho^{-2} n r.$$

Then with probability at least $1 - e^{-C_2 m}$ it holds that for any $X \in \mathcal{H}_n$, any solution $X^\sharp$ to the convex
optimization problem (11) with noisy measurements $b = \mathcal{A}(X) + \epsilon$, where $\|\epsilon\|_{\ell_2} \leq \eta$, obeys

$$\|X - X^\sharp\|_F \leq \frac{2(1+\rho)^2}{(1-\rho)\sqrt{r}} \|X_c\|_* + \frac{(3+\rho) C_3}{(1-\rho)} \cdot \frac{\eta}{\sqrt{m}}. \tag{12}$$

Here, $C_1$, $C_2$ and $C_3$ denote positive universal constants. (In particular, for $\eta = 0$ and $X$ of rank at most $r$
one has exact reconstruction.)


In addition to the Gaussian measurement setting, we also consider measurement matrices that arise
from taking the outer product of elements chosen independently from an approximate complex projective
4-design. Complex projective $t$-designs are finite sets of unit vectors in $\mathbb{C}^n$ that exhibit a very particular
structure. Roughly speaking, sampling independently from a complex projective $t$-design reproduces the
first $t$ moments of sampling uniformly from the complex unit sphere. Likewise, approximate complex
projective $t$-designs obey such a structural requirement approximately; for a precise introduction, we refer
to Definition 27 below. As a consequence, they serve as a general purpose tool for partially de-randomizing
results that initially required Gaussian random vectors [42, 28]. This is also the case here and employing
complex projective 4-designs allows for partially de-randomizing Theorem 2 at the cost of a slightly larger
sampling rate. Here, we content ourselves with presenting a shortened version of this result and refer the
reader to Theorem 28 where precise requirements on the approximate design are stated.


Theorem 3. Let $r$, $\rho$ be as in Theorem 2 and suppose that each measurement matrix $A_j$ is of the form
(10), where $a_j$, $j = 1, \ldots, m$, are chosen independently from a (sufficiently accurate approximate) complex
projective 4-design. If

$$m \geq C_1 \rho^{-2} n r \log n,$$

then the assertions of Theorem 2 remain valid, possibly with different universal constants.


Note that Theorems 1, 2, 3 resp. Theorem 19 below and their proofs are presented in condensed versions 
in the conference papers [34] resp. [35]. 

1.3. Recovery of positive semidefinite matrices reduces to a feasibility problem. Imposing
additional structure on the matrices to be recovered can further strengthen low rank recovery guarantees.
Positive semidefiniteness is one such structural prerequisite that, for instance, occurs naturally in the phase
retrieval problem, quantum mechanics and kernel-based learning methods [61]. Motivated by the former,
Demanet and Hand [21] pointed out that minimizing the nuclear norm, in the sense of algorithm (4),
can be superfluous for recovering positive semidefinite matrices of rank one. Instead, they propose to reduce
the recovery algorithm to a mere feasibility problem and proved that such a reduction works w.h.p. for
rank one projective measurements onto Gaussian vectors (the measurement scenario considered in Theorem
2). Subsequently, this recovery guarantee was strengthened by Candes and Li [13]. Here, we go one step
further and generalize these results to cover uniform and stable recovery of positive semidefinite matrices
of arbitrary rank. Relying on ideas presented in [36], we establish the following statement. (We refer to
Section 1.4 for the definition of the Schatten $p$-norm $\|\cdot\|_p$ used in (13).)

Theorem 4. Fix $r \leq n$ and consider the measurement processes introduced in Theorem 2 (Gaussian vectors),
or Theorem 3 (complex projective 4-designs), respectively. Assume that $m \geq C_1 n r$ (in the Gaussian case)
resp. $m \geq C_2 s n r \log n$ (in the design case), where $s \geq 1$ is arbitrary. Then, for $1 \leq p \leq 2$ and any two
positive semidefinite matrices $X, Z \in \mathcal{H}_n$,

$$\|Z - X\|_p \leq \frac{C_3}{r^{1 - 1/p}} \|X_c\|_1 + C_4 \frac{r^{1/p - 1/2}}{\sqrt{m}} \|\mathcal{A}(Z) - \mathcal{A}(X)\|_{\ell_2} \tag{13}$$

holds universally with probability exceeding $1 - e^{-C_5 m}$ for the Gaussian case and $1 - e^{-sr}$ in the design case.
Here, $C_1, \ldots, C_5$ denote suitable positive universal constants.

This statement renders nuclear norm minimization in the sense of (4) redundant and allows for a
regularization-free estimation. Moreover, knowledge of a noise bound $\|w\|_{\ell_2} \leq \eta$ on the measurement
process (3) is no longer required, since we can estimate any $X \succeq 0$ by solving a least squares problem of the
form (6), i.e.,

$$\min_{Z \in \mathcal{H}_n} \|\mathcal{A}(Z) - b\|_{\ell_2} \quad \text{subject to} \quad Z \succeq 0. \tag{14}$$

Theorem 4 then in particular assures that the minimizer $Z^\sharp$ of this optimization program obeys

$$\|Z^\sharp - X\|_F \leq \frac{C_3}{\sqrt{r}} \|X_c\|_1 + \frac{C_4}{\sqrt{m}} \|\mathcal{A}(Z^\sharp) - \mathcal{A}(X)\|_{\ell_2} \leq \frac{C_3}{\sqrt{r}} \|X_c\|_1 + \frac{2 C_4}{\sqrt{m}} \|w\|_{\ell_2},$$

where $w \in \mathbb{R}^m$ represents additive noise in the measurement process. It is worthwhile to mention that if
a matrix $X$ of interest has rank at most $r$ and no noise is present in the sampling process (3), Theorem 4
assures

$$\{Z : Z \succeq 0, \ \mathcal{A}(Z) = \mathcal{A}(X)\} = \{X\} \tag{15}$$

with high probability. Hence, recovering X from noiseless measurements indeed reduces to a feasibility 
problem. 
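
The passage from the residual $\|\mathcal{A}(Z^\sharp) - \mathcal{A}(X)\|_{\ell_2}$ to the noise level $\|w\|_{\ell_2}$ in the bound for the least squares minimizer uses only optimality of $Z^\sharp$. A short sketch of this step, assuming $b = \mathcal{A}(X) + w$ as in (3) and that $Z^\sharp$ minimizes (14):

```latex
\begin{align*}
\|\mathcal{A}(Z^\sharp) - \mathcal{A}(X)\|_{\ell_2}
  &\leq \|\mathcal{A}(Z^\sharp) - b\|_{\ell_2} + \|b - \mathcal{A}(X)\|_{\ell_2} \\
  &\leq \|\mathcal{A}(X) - b\|_{\ell_2} + \|w\|_{\ell_2}
   = 2 \|w\|_{\ell_2}.
\end{align*}
```

The first line is the triangle inequality. The second line holds because $X \succeq 0$ is itself feasible for (14), so the minimizer $Z^\sharp$ cannot have a larger residual than $X$. In the noiseless case $w = 0$ and for $X$ of rank at most $r$, both terms of the error bound vanish, which is the uniqueness statement (15).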

We emphasize that Theorem 4 is only established for rank one projective measurements. For the other 
measurement ensembles considered here — matrices with independent entries — one cannot expect such 
a statement to hold. This pessimistic prediction is due to negative results recently established in [63, 
Proposition 2]. Focusing on real matrices, the authors show that if the measurement matrices Aj are chosen 
independently from a Gaussian orthogonal ensemble, then estimating any symmetric, positive semidefinite 
matrix X via (14) becomes ill-posed, unless the number of measurements obeys 

$$m \geq \frac{1}{4} n(n + 1) = \mathcal{O}(n^2).$$

Finally, we want to point out that the fruitfulness of plain least squares regression for recovering positive 
semidefinite matrices was already pointed out and explored by Slawski, Li and Hein [63]. However, there is 
a crucial difference in the mindset of [63] and the results presented here. The main result [63, Theorem 2] 
of Slawski et al. assumes a fixed signal $X \succeq 0$ of interest and provides bounds for the reconstruction error
in terms of geometric properties of both $X$ and the measurement ensemble. Conversely, Theorem 4 assumes
fixed measurements (e.g. $m = Crn$ projectors onto Gaussian random vectors) and w.h.p. assures robust
recovery of all matrices $X \succeq 0$ having approximately rank $r$ simultaneously.

1.4. Notation. The Schatten $p$-norm of $Z \in \mathbb{C}^{n \times n}$ is given by

$$\|Z\|_p = \Big( \sum_{j=1}^n \sigma_j(Z)^p \Big)^{1/p}, \quad p \geq 1,$$

where $\sigma_j(Z)$, $j = 1, \ldots, n$, denote the singular values of $Z$. It reduces to the nuclear norm $\|\cdot\|_*$ for $p = 1$
and the Frobenius norm $\|\cdot\|_F$ for $p = 2$. It is a common convention that the singular values of $Z$ are
non-increasingly ordered. We write $Z = Z_r + Z_c$, where $Z_r$ is the best rank-$r$ approximation of $Z$ with
respect to any Schatten $p$-norm of $Z$.
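
As a small illustration of this notation, the sketch below evaluates Schatten norms directly from a list of singular values and computes the nuclear norm of the tail $Z_c$. The singular values are made-up numbers and the function names `schatten_norm` and `tail` are hypothetical; computing the singular values of an actual matrix is left to a linear algebra library.

```python
def schatten_norm(singular_values, p):
    """Schatten p-norm (p >= 1) computed from the singular values."""
    return sum(s ** p for s in singular_values) ** (1.0 / p)


def tail(singular_values, r):
    """Singular values of Z_c = Z - Z_r: everything but the r largest."""
    return sorted(singular_values, reverse=True)[r:]


# Illustrative spectrum of an approximately rank-2 matrix.
sigma = [5.0, 3.0, 0.2, 0.1]

nuclear = schatten_norm(sigma, 1)        # ||Z||_* = ||Z||_1
frobenius = schatten_norm(sigma, 2)      # ||Z||_F = ||Z||_2
tail_nuclear = schatten_norm(tail(sigma, 2), 1)   # ||Z_c||_1 for r = 2
```

Here `tail_nuclear` is the quantity $\|Z_c\|_1$ that enters the stability bounds: it is small exactly when $Z$ is well approximated by a rank-$r$ matrix.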



2. Applications 


2.1. Phase retrieval. The problem of retrieving a complex signal $x \in \mathbb{C}^n$ from measurements that are
ignorant towards phase information has long been abundant in many areas of science. Measurements of that
type correspond to

$$b_i = |\langle a_i, x \rangle|^2 + w_i, \quad i = 1, \ldots, m, \tag{16}$$

where $a_1, \ldots, a_m \in \mathbb{C}^n$ are measurement vectors and $w_i$ denotes additive noise. Recently, the problem’s
mathematical structure has received considerable attention in its own right. It is clearly ill-posed, since
all phase information is lost in the measurement process and, moreover, the measurements (16) are of a
non-linear nature. This second obstacle can be overcome by a trick [5] well known in conic programming:
the quadratic expressions (16) are linear in the outer products $xx^*$ and $a_i a_i^*$:

$$b_i = |\langle a_i, x \rangle|^2 + w_i = \operatorname{tr}\big( (a_i a_i^*) (xx^*) \big) + w_i. \tag{17}$$

Note that such a “lift” allows for reinterpreting the phaseless sampling process as $b = \mathcal{A}(xx^*) + w$. Also,
the new object of interest $X := xx^*$ is a Hermitian, positive semidefinite matrix of rank one. In turn, the
measurement matrices $A_i = a_i a_i^*$ are constrained to be proportional to rank-one projectors. Consequently,
such a “lift” turns the phase retrieval problem into a very particular instance of low rank matrix recovery,
a fact that was first observed by Candes, Eldar, Strohmer and Voroninski [12, 10]. Subsequently, uniform
recovery guarantees for $m = Cn$ complex standard Gaussian measurement vectors $a_i$ have been established
which are stable towards additive noise. The main result in [13] establishes with high probability that for 
any X = xx*, solving the convex optimization problem (PhaseLift) 

$$\min_{Z \in \mathcal{H}_n} \|\mathcal{A}(Z) - b\|_{\ell_1} \quad \text{subject to} \quad Z \succeq 0 \tag{18}$$

yields an estimator $Z^\sharp$ obeying $\|Z^\sharp - xx^*\|_2 \leq C \|w\|_{\ell_1} / m$. If a bound $\|w\|_{\ell_2} \leq \eta$ on the noise in the
sampling process (16) is available, an extension of [43, Theorem 2] (see Section 2.3.2 in loc. cit.) establishes
a comparable recovery guarantee via solving

$$\min_{Z \in \mathcal{H}_n} \operatorname{tr}(Z) \quad \text{subject to} \quad \|\mathcal{A}(Z) - b\|_{\ell_2} \leq \eta, \quad Z \succeq 0 \tag{19}$$

instead of PhaseLift. Our findings allow for establishing novel recovery guarantees for retrieving phases.
Indeed, since (17) assures that any signal of interest is positive semidefinite and has precisely rank one, 
Theorem 4 is applicable and yields the following corollary. 

Corollary 5. Consider $m \geq Cn$ phaseless measurements of the form (16), where each $a_i$ is a complex
standard Gaussian vector. Then with probability at least $1 - e^{-\tilde{C} m}$ these measurements allow for estimating
any signal $x \in \mathbb{C}^n$ via solving

$$\min_{Z \in \mathcal{H}_n} \|\mathcal{A}(Z) - b\|_{\ell_2} \quad \text{subject to} \quad Z \succeq 0. \tag{20}$$

The resulting minimizer $Z^\sharp$ of (20) obeys

$$\|Z^\sharp - xx^*\|_2 \leq C \frac{\|w\|_{\ell_2}}{\sqrt{m}},$$

where $C$ and $\tilde{C}$ denote positive constants and $w \in \mathbb{R}^m$ represents additive noise in the sampling process (16).

An analogous statement is true, with a weaker probability of success $1 - e^{-s}$ for $s \geq 1$, for
$m \geq C' s n \log(n)$ rank one projective measurements onto independent elements of an approximate 4-design.

This recovery procedure is in spirit very similar to (18), but it utilizes an $\ell_2$-regression instead of an
$\ell_1$-norm minimization. Numerical studies indicate that algorithm (20) outperforms (19) as well as (18).
These studies were motivated by and accompany actual quantum mechanical experiments and will be published
elsewhere [41].

Finally, we want to relate Corollary 5 to a non-convex phaseless recovery procedure devised by Candes, 
Li and Soltanolkotabi [14]. There, the authors refrain from applying the aforementioned “lifting” trick 
to render the phase retrieval problem linear. Instead, they use a careful initialization step, followed by 
a gradient descent scheme (based on Wirtinger derivatives) to minimize the problem’s least squares loss 
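function directly over complex vectors $z \in \mathbb{C}^n$. Mathematically, such an optimization is equivalent to
solving

$$\min_{Z \in \mathcal{H}_n} \|\mathcal{A}(Z) - b\|_{\ell_2} \quad \text{subject to} \quad Z \succeq 0, \quad \operatorname{rank}(Z) = 1 \tag{21}$$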

and the rank-constraint manifests the problem’s non-convex nature. Hence, the convex optimization problem 
(20) can be viewed as a convex relaxation of (21), obtained by omitting the non-convex rank constraint. 

2.2. Quantum information. In this section we describe implications and possible applications of our 
findings to problems in quantum information science. For the sake of being self-contained, we have included 
a brief introduction to crucial notions of quantum mechanics in the appendix. Quantum mechanics postulates 
that a finite n-dimensional quantum system is described by an Hermitian, positive semidefinite matrix X with 
unit trace, called a density operator. This “quantum shape constraint” assures that all density operators 
meet the requirements of Theorem 4. Furthermore, the rank-one projective measurements assumed in 
that theorem can be recast as valid quantum mechanical measurements — see [43, Section 3] for possible 
implementations and further discussion on this topic. Note, however, that such a reinterpretation is in general 
not possible for the measurement matrices with independent entries considered in Theorem 1, because these
matrices fail to be Hermitian. With Theorem 4 at hand, we underline its implications for two prominent 
issues in (finite dimensional) quantum mechanics. 

2.2.1. Quantum state tomography. Inferring a quantum mechanical description of a physical system is
equivalent to assigning it a density operator (or quantum state), a process referred to as quantum state
tomography [6, 23]. Tomography is now a routine task for designing, testing and tuning qubits in the quest of
building quantum information processing devices. Since the size of controllable quantum mechanical
systems is ever increasing¹, it is very desirable to exploit additional structure, if present, when performing
such a task. One such structural property, often encountered in actual experiments, is approximate
purity, i.e., the density operator $X$ is well approximated by a low rank matrix. Performing quantum state
tomography under such a prior assumption therefore constitutes a particular instance of low rank matrix
recovery [30, 24].

The results presented in this paper provide recovery guarantees for tomography protocols that stably 
tolerate noisy measurements and moreover are robust towards the prior assumption of approximate purity. 
In the context of tomography, results of this type so far have already been established for $m = C n r \log^6 n$
random (generalized) Pauli measurements [47, Proposition 2.3] via proving a rank-RIP for such measurement 
matrices and then resorting to [15, Lemma 3.2]. However, this auxiliary result manifestly requires additive 
Gaussian noise and using a type of Dantzig, or Lasso selector to recover the best rank-r approximation of a 
given density operator. This is not the case for the result established here, where performing a plain least 
squares regression of the form (14) is sufficient. 

¹Nowadays, experimentalists are able to create and control multi-partite systems of overall dimension $n = 2^8$ in their
laboratories [60]. This results in a density operator of size $256 \times 256$ (a priori 65 536 parameters).

Corollary 6. Fix $r \leq n$ and suppose that the measurement operator $\mathcal{A} : \mathcal{H}_n \to \mathbb{R}^m$ is of the form

$$\mathcal{A}(X) = \sum_{i=1}^m \sqrt{\frac{(n+1)n}{m}} \, \langle a_i, X a_i \rangle e_i + w \quad \text{with} \quad m \geq C_1 r n \log n,$$

where each $a_i \in \mathbb{C}^n$ is chosen independently from an approximate 4-design and $w \in \mathbb{R}^m$ denotes additive
noise. Then, the best rank-$r$ approximation of any density operator $X$ can be obtained from such
measurements via solving

$$\min_{Z \succeq 0} \|\mathcal{A}(Z) - \mathcal{A}(X)\|_{\ell_2} \quad \text{subject to} \quad \operatorname{tr}(Z) = 1. \tag{22}$$

With probability at least $1 - e^{-C_2 m}$ the minimizer $Z^\sharp$ of this optimization obeys

$$\|X - Z^\sharp\|_1 \leq C_3 \|X_c\|_1 + C_4 \sqrt{r} \, \|w\|_{\ell_2}, \tag{23}$$

where $C_1, C_2, C_3$ and $C_4$ denote positive constants.

This statement is a direct consequence of Theorem 4. For the sake of clarity, we have re-scaled each
projective measurement with $\sqrt{(n+1)n/m}$. This simplifies the resulting expression (23) and moreover facilitates²
direct comparison with the main result in [47], as it closely mimics the scaling employed there.

Corollary 6 is valid for any type of additive noise and no a priori knowledge of its magnitude is required.
This includes the case of a Bernoulli error model (see e.g. [17, Section 2.2.2] and also [24]), which is
particularly relevant for tomography experiments. Also, note that the recovery error is
bounded in nuclear norm, instead of Frobenius norm. Such a bound is very meaningful for tomography, since
quantum mechanics is a probabilistic theory and the nuclear norm encapsulates total variational distance.
Moreover, Helstrom’s theorem [32] provides an operational interpretation of the nuclear norm distance
bounded in (23): it is proportional to the maximal bias achievable in the task of distinguishing the two
quantum states $X$ and $Z^\sharp$, provided that any physical measurement can be implemented.

Finally, note that the bound on the probability of failure in Corollary 6 is much stronger than the one 
provided in Theorem 4. Such a strengthening is possible, because the trace of any density operator equals 
one. We comment on this in Remark 34 below. 

2.2.2. Distinguishing quantum states. One crucial prerequisite in the task of inferring density operators from
measurement data is the ability to faithfully distinguish any two density operators via quantum mechanical
measurements. The most general notion of a quantum measurement is a positive operator valued measure
(POVM) $M = \{E_m : E_m \succeq 0, \ \sum_m E_m = \mathrm{Id}\}$ [53, Chapter 2.2]. A POVM $M$ is called informationally
complete (IC) [62] if for any two density operators $X \neq Z \in \mathcal{H}_n$ there exists $E_m \in M \subset \mathcal{H}_n$ such that

$$\operatorname{tr}(E_m X) \neq \operatorname{tr}(E_m Z). \tag{24}$$

This assures the possibility of discriminating any two quantum states via such a measurement in the absence
of noise. Without additional restrictions, such an IC POVM must contain at least $n^2$ elements. However,
such a lower bound can be too pessimistic if the density operators of interest have additional structure.
Approximate purity introduced in the previous subsection can serve as such an additional structural
restriction:

Definition 7 (Rank-$r$ IC, Definition 1 in [31]). For $r \leq n$, we call a POVM $M = \{E_m\}_{m \geq 1}$ rank-$r$ restricted
informationally complete (rank-$r$ IC), if (24) holds for any two density operators of rank at most $r$.

Bounds for the number $m$ of POVM elements required to assure rank-$r$ IC have been established in
[31, 37, 38]. These approaches exploit topological obstructions of embeddings for establishing lower bounds
and explicit POVM constructions for upper bounds. For instance, in [31] a particular rank-$r$ IC POVM
containing $m = 4r(n - r) - 1$ elements is constructed.
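
To make the saving concrete, the following sketch compares the size $4r(n - r) - 1$ of the construction from [31] with $n^2$, the real dimension of the space of Hermitian $n \times n$ matrices, which is taken here as the reference count for unrestricted informational completeness. The function names are hypothetical and the dimension $n = 2^8$ is only an illustrative choice.

```python
def unrestricted_count(n):
    """Real dimension n^2 of the Hermitian n x n matrices (reference count)."""
    return n * n


def rank_r_ic_construction(n, r):
    """Number of POVM elements 4 r (n - r) - 1 quoted in the text from [31]."""
    return 4 * r * (n - r) - 1


n = 2 ** 8
for r in (1, 2, 5):
    print(f"r={r}: rank-r IC construction uses {rank_r_ic_construction(n, r)} "
          f"elements, compared with n^2 = {unrestricted_count(n)}")
```

For small rank the restricted count grows like $4rn$, which is linear in the dimension, while the unrestricted count grows quadratically.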

Focusing less on establishing tight bounds and more on identifying entire families of rank-$r$ IC
measurements, Kalev et al. [36] observed that each measurement ensemble fulfilling the rank-RIP for some $r \leq n$
is also rank-$r$ IC. This in particular applies with high probability to $m = C n r \log^6 n$ random (generalized)
Pauli measurements [47]. Theorem 4, and likewise Corollary 6, allow us to draw similar conclusions without
having to rely on any rank-RIP. Indeed, in the absence of noise, these results guarantee for any rank-$r$
density operator $X$

$$\{Z : Z \succeq 0, \ \mathcal{A}(Z) = \mathcal{A}(X)\} = \{X\} \tag{25}$$

with high probability. If this is the case, the measurement operator $\mathcal{A}$ allows for uniquely identifying any
rank-$r$ density operator $X$. This in turn implies that $\mathcal{A}$ is rank-$r$ IC and the following corollary is immediate:

²In fact, by resorting to the Frobenius norm bound in Theorem 4 (instead of the nuclear norm bound employed to arrive at
Corollary 6), one obtains a performance guarantee that strongly resembles [47, Equation (8)], the main recovery guarantee
in that paper.

Corollary 8. Fix $r \leq n$ arbitrary and let $C$, $C'$ be absolute constants of sufficient size. Then

(1) Any POVM containing $m = Cnr$ projectors onto Haar random vectors is rank-$r$ IC with probability
at least 1 — 

(2) Any POVM containing $m = C' n r \log n$ projectors onto random elements of a (sufficiently accurate
approximate) 4-design is rank-$r$ IC with probability at least 1 −

This statement is reminiscent of a conclusion drawn in [3, 48]: In the task of distinguishing quantum
states, a POVM containing a 4-design essentially performs as well as the uniform POVM (the union of
all rank-one projectors).

Remark 9. In the process of finishing this article we became aware of recent work by Kech and Wolf [39], 
who showed that the elements of a generic Parseval frame generate a rank-r IC map if $m \ge 4r(n - r)$. In 
fact, Xu showed in [68] that $m \ge 4r(n - r)$ is both a sufficient and necessary condition for identifiability of 
complex rank r matrices in $\mathbb{C}^{n \times n}$. We emphasize, however, that these results are only concerned with pure 
identifiability and do not come with a practical and stable recovery algorithm. 


3. The null space property for low-rank matrix recovery 


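Let $X \in \mathbb{C}^{n_1 \times n_2}$. If X is only approximately of low rank, then we would like to find a condition on the 
measurement map $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ that controls the recovery error in terms of the error of the best 
approximation of X by low-rank matrices. Moreover, it should also take into account that the measurements 
might be noisy. 


Definition 10. We say that $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ satisfies the Frobenius robust rank null space property of 
order r with constants $0 < \rho < 1$ and $\tau > 0$ if for all $M \in \mathbb{C}^{n_1 \times n_2}$ the singular values of M satisfy 

\[ \|M_r\|_2 \le \frac{\rho}{\sqrt{r}} \|M_c\|_1 + \tau \|\mathcal{A}(M)\|_{\ell_2}. \]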

The stability and robustness of (4) are established by the following theorem. 


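Theorem 11. Let $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ satisfy the Frobenius robust rank null space property of order r with 
constants $0 < \rho < 1$ and $\tau > 0$. Let $n = \min\{n_1, n_2\}$. Then for any $X \in \mathbb{C}^{n_1 \times n_2}$ any solution $X^\#$ of (4) 
with $b = \mathcal{A}(X) + w$, $\|w\|_{\ell_2} \le \eta$, approximates X with error 

\[ \|X - X^\#\|_2 \le \frac{2(1+\rho)^2}{(1-\rho)\sqrt{r}} \|X_c\|_1 + \frac{2\tau(3+\rho)}{1-\rho}\, \eta. \]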


Theorem 11 can be deduced from the following stronger result. 


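Theorem 12. Let $1 \le p \le 2$ and $n = \min\{n_1, n_2\}$. Suppose that $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ satisfies the Frobenius 
robust rank null space property of order r with constants $0 < \rho < 1$ and $\tau > 0$. Then for any $X, Z \in \mathbb{C}^{n_1 \times n_2}$ 

\[ \|Z - X\|_p \le \frac{C}{r^{1-1/p}} \left( \|Z\|_1 - \|X\|_1 + 2\|X_c\|_1 \right) + D \tau\, r^{1/p-1/2} \|\mathcal{A}(Z - X)\|_{\ell_2}, \tag{26} \]

with constants $C = \frac{(1+\rho)^2}{1-\rho}$ and $D = \frac{3+\rho}{1-\rho}$. 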

The proof requires some auxiliary lemmas. We start with a matrix version of Stechkin’s bound. 


Lemma 13. Let $M \in \mathbb{C}^{n_1 \times n_2}$ and $r \le \min\{n_1, n_2\}$. Then, for p > 0, 

\[ \|M_c\|_p \le \frac{1}{r^{1-1/p}} \|M\|_1. \]




^ Haar random vectors are vectors drawn uniformly from the complex unit sphere in $\mathbb{C}^n$. They can be obtained from 
complex standard Gaussian vectors by rescaling them to unit length. Property (25) is invariant under such a re-scaling and 
Theorem 2 therefore assures rank-r IC for both Gaussian and Haar random vectors. 

Proof. This follows immediately from [26, Proposition 2.3], but for convenience we give the proof. Since the 
singular values of M are non-increasingly ordered, it holds 

\begin{align*}
\|M_c\|_p^p = \sum_{j=r+1}^n \sigma_j(M)^p &= \sum_{j=r+1}^n \sigma_j(M)^{p-1} \sigma_j(M) \le \sigma_r(M)^{p-1} \sum_{j=r+1}^n \sigma_j(M) \\
&\le \Big( \frac{1}{r} \sum_{j=1}^r \sigma_j(M) \Big)^{p-1} \sum_{j=r+1}^n \sigma_j(M) \le \frac{\|M\|_1^{p-1} \|M\|_1}{r^{p-1}}.
\end{align*}

Taking the p-th root yields the claim. □ 
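As a quick numerical sanity check of the bound in Lemma 13, the following Python sketch evaluates both sides of $\|M_c\|_p \le r^{-(1-1/p)} \|M\|_1$ on random non-negative sequences that play the role of singular values. The helper names `schatten_norm` and `stechkin_sides` are illustrative assumptions and do not appear in the paper; the random check is restricted to exponents $1 \le p \le 2$, the range in which the lemma is used in Theorem 12.

```python
import random


def schatten_norm(sigma, p):
    """Schatten p-norm computed from a list of singular values."""
    return sum(s ** p for s in sigma) ** (1.0 / p)


def stechkin_sides(sigma, r, p):
    """Return (lhs, rhs) of ||M_c||_p <= ||M||_1 / r^(1 - 1/p).

    sigma: singular values of M in any order; r: rank cut-off (r >= 1).
    """
    ordered = sorted(sigma, reverse=True)   # non-increasing singular values
    tail = ordered[r:]                      # singular values of M_c
    lhs = schatten_norm(tail, p) if tail else 0.0
    rhs = sum(ordered) / r ** (1.0 - 1.0 / p)
    return lhs, rhs


# Random check for exponents in [1, 2] (hypothetical test data).
rng = random.Random(0)
for _ in range(1000):
    n = rng.randint(2, 12)
    r = rng.randint(1, n)
    p = rng.uniform(1.0, 2.0)
    sigma = [rng.random() for _ in range(n)]
    lhs, rhs = stechkin_sides(sigma, r, p)
    assert lhs <= rhs + 1e-12
```

The sketch works on singular values only, since both sides of the inequality depend on M solely through them.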


The next result shows that under the Frobenius robust rank null space property the distance between 
two matrices is controlled by the difference between their norms and the $\ell_2$-norm of the difference between 
their measurements. 


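Lemma 14. Suppose that $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ satisfies the Frobenius robust rank null space property of order 
r with constants $0 < \rho < 1$ and $\tau > 0$. Let $X, Z \in \mathbb{C}^{n_1 \times n_2}$ and $n = \min\{n_1, n_2\}$. Then 

\[ \|X - Z\|_1 \le \frac{1+\rho}{1-\rho} \left( \|Z\|_1 - \|X\|_1 + 2\|X_c\|_1 \right) + \frac{2\tau\sqrt{r}}{1-\rho} \|\mathcal{A}(X - Z)\|_{\ell_2}. \]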


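Proof. Theorem 7.4.9.1 in [33] states that for matrices A, B of the same size over $\mathbb{C}$ 

\[ \|A - B\| \ge \|\Sigma(A) - \Sigma(B)\|, \]

where $\|\cdot\|$ is any unitarily invariant norm and $\Sigma(\cdot)$ denotes the diagonal matrix of singular values of its 
argument. Hence, 

\begin{align*}
\|Z\|_1 = \|X - (X - Z)\|_1 &\ge \sum_{j=1}^n |\sigma_j(X) - \sigma_j(X - Z)| \\
&= \sum_{j=1}^r |\sigma_j(X) - \sigma_j(X - Z)| + \sum_{j=r+1}^n |\sigma_j(X) - \sigma_j(X - Z)| \\
&\ge \sum_{j=1}^r \big( \sigma_j(X) - \sigma_j(X - Z) \big) + \sum_{j=r+1}^n \big( \sigma_j(X - Z) - \sigma_j(X) \big).
\end{align*}

Hence, 

\begin{align*}
\|(X - Z)_c\|_1 = \sum_{j=r+1}^n \sigma_j(X - Z) &\le \|Z\|_1 - \sum_{j=1}^r \sigma_j(X) + \sum_{j=1}^r \sigma_j(X - Z) + \|X_c\|_1 \\
&\le \|Z\|_1 - \|X\|_1 + \sqrt{r}\, \|(X - Z)_r\|_2 + 2\|X_c\|_1.
\end{align*}

Applying the Frobenius robust rank null space property of $\mathcal{A}$ we obtain 

\[ \|(X - Z)_c\|_1 \le \|Z\|_1 - \|X\|_1 + \rho \|(X - Z)_c\|_1 + \tau\sqrt{r}\, \|\mathcal{A}(X - Z)\|_{\ell_2} + 2\|X_c\|_1. \]

By rearranging the terms in the above inequality we obtain 

\[ \|(X - Z)_c\|_1 \le \frac{1}{1-\rho} \left( \|Z\|_1 - \|X\|_1 + \tau\sqrt{r}\, \|\mathcal{A}(X - Z)\|_{\ell_2} + 2\|X_c\|_1 \right). \]

In order to bound $\|X - Z\|_1$ we use Hölder's inequality, the Frobenius robust rank null space property of $\mathcal{A}$ 
and the inequality above, 

\begin{align*}
\|X - Z\|_1 &= \|(X - Z)_r\|_1 + \|(X - Z)_c\|_1 \le \sqrt{r}\, \|(X - Z)_r\|_2 + \|(X - Z)_c\|_1 \\
&\le (1 + \rho) \|(X - Z)_c\|_1 + \tau\sqrt{r}\, \|\mathcal{A}(Z - X)\|_{\ell_2} \\
&\le \frac{1+\rho}{1-\rho} \left( \|Z\|_1 - \|X\|_1 + \tau\sqrt{r}\, \|\mathcal{A}(X - Z)\|_{\ell_2} + 2\|X_c\|_1 \right) + \tau\sqrt{r}\, \|\mathcal{A}(X - Z)\|_{\ell_2} \\
&= \frac{1+\rho}{1-\rho} \left( \|Z\|_1 - \|X\|_1 + 2\|X_c\|_1 \right) + \frac{2\tau\sqrt{r}}{1-\rho} \|\mathcal{A}(X - Z)\|_{\ell_2}.
\end{align*}

This concludes the proof. □ 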

Now we return to the proof of the theorem. 

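Proof of Theorem 12. By Hölder's inequality, Lemma 13 and the Frobenius robust rank null space property 
of $\mathcal{A}$ 

\begin{align*}
\|Z - X\|_p &\le \|(X - Z)_r\|_p + \|(X - Z)_c\|_p \le r^{1/p-1/2} \|(X - Z)_r\|_2 + \|(X - Z)_c\|_p \\
&\le \frac{\rho}{r^{1-1/p}} \|(X - Z)_c\|_1 + \tau\, r^{1/p-1/2} \|\mathcal{A}(X - Z)\|_{\ell_2} + \frac{1}{r^{1-1/p}} \|X - Z\|_1 \\
&\le \frac{1+\rho}{r^{1-1/p}} \|X - Z\|_1 + \tau\, r^{1/p-1/2} \|\mathcal{A}(X - Z)\|_{\ell_2}. \tag{27}
\end{align*}

Substituting the result of Lemma 14 into (27) yields the desired inequality. □ 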
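The way Theorem 11 follows from Theorem 12 can be made concrete with a small Python sketch. It assumes the constants $C = (1+\rho)^2/(1-\rho)$ and $D = (3+\rho)/(1-\rho)$ that result from combining (27) with Lemma 14; the function names are hypothetical helpers, not notation from the paper. Choosing $p = 2$, $\|Z\|_1 \le \|X\|_1$ (Z is a minimizer) and $\|\mathcal{A}(Z - X)\|_{\ell_2} \le 2\eta$ turns the right-hand side of (26) into the bound of Theorem 11.

```python
import math


def nsp_constants(rho):
    """Constants C and D obtained by inserting Lemma 14 into (27)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    C = (1.0 + rho) ** 2 / (1.0 - rho)
    D = (3.0 + rho) / (1.0 - rho)
    return C, D


def theorem12_rhs(rho, tau, r, p, nuclear_gap, tail, residual):
    """Right-hand side of (26).

    nuclear_gap = ||Z||_1 - ||X||_1, tail = ||X_c||_1,
    residual = ||A(Z - X)||_{l2}.
    """
    C, D = nsp_constants(rho)
    return (C / r ** (1.0 - 1.0 / p)) * (nuclear_gap + 2.0 * tail) \
        + D * tau * r ** (1.0 / p - 0.5) * residual


def theorem11_rhs(rho, tau, r, tail, eta):
    """Frobenius-norm error bound for a minimizer with noise level eta."""
    C, D = nsp_constants(rho)
    return 2.0 * C / math.sqrt(r) * tail + 2.0 * tau * D * eta


# p = 2, nuclear_gap = 0 and residual = 2 * eta reproduce Theorem 11.
a = theorem12_rhs(0.5, 0.1, 4, 2.0, 0.0, 1.5, 2 * 0.3)
b = theorem11_rhs(0.5, 0.1, 4, 1.5, 0.3)
assert abs(a - b) < 1e-12
```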

As a corollary of Theorem 12 we obtain that if $X \in \mathbb{C}^{n_1 \times n_2}$ is a matrix of rank at most r and the 
measurements are noiseless ($\eta = 0$), then the Frobenius robust rank null space property implies that X is 
the unique solution of 

\[ \min_{Z \in \mathbb{C}^{n_1 \times n_2}} \|Z\|_1 \quad \text{subject to} \quad \mathcal{A}(Z) = b. \tag{28} \]

It was first stated in [57] that a slightly weaker property is actually equivalent to the successful recovery of 
X via (28). 

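Theorem 15 (Null space property). Given $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$, every $X \in \mathbb{C}^{n_1 \times n_2}$ of rank at most r is the 
unique solution of (28) with $b = \mathcal{A}(X)$ if and only if, for all $M \in \ker \mathcal{A} \setminus \{0\}$, it holds 

\[ \|M_r\|_1 < \|M_c\|_1. \tag{29} \]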

For the proof we refer to [57] and [26, Chapter 4.6]. According to Lemma 14, another implication of the 
Frobenius robust rank null space property consists in the following error estimate in $\|\cdot\|_1$ for the case of 
noiseless measurements, 

\[ \|X - X^\#\|_1 \le \frac{2(1+\rho)}{1-\rho} \|X_c\|_1. \]

The above estimate remains true if we require that for all $M \in \ker \mathcal{A}$, the singular values of M satisfy 

\[ \|M_r\|_1 \le \rho \|M_c\|_1, \qquad 0 < \rho < 1. \]

This property is known as the stable rank null space property of order r with constant $\rho$. It is clear that if 
$\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ satisfies the Frobenius robust rank null space property, then it satisfies the stable rank 
null space property. The approach used in [54] to verify that the stable null space property accounts for 
stable recovery of matrices which are not exactly of low rank exploits the similarity between sparse 
vector recovery and low-rank matrix recovery. It shows that if some condition is sufficient for stable and 
robust recovery of any sparse vector with at most r non-zero entries, then the extension of this condition to 
the matrix case is sufficient for the stable and robust recovery of any matrix up to rank r. 

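In order to check whether the measurement map $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ satisfies the Frobenius robust rank 
null space property, we introduce the set 

\[ T_{\rho,r} := \left\{ M \in \mathbb{C}^{n_1 \times n_2} : \|M\|_2 = 1,\ \|M_r\|_2 > \frac{\rho}{\sqrt{r}} \|M_c\|_1 \right\}. \]

Lemma 16. If 

\[ \inf \{ \|\mathcal{A}(M)\|_{\ell_2} : M \in T_{\rho,r} \} > \frac{1}{\tau}, \]

then $\mathcal{A}$ satisfies the Frobenius robust rank null space property of order r with constants $\rho$ and $\tau$. 

Proof. Suppose that 

\[ \inf \{ \|\mathcal{A}(M)\|_{\ell_2} : M \in T_{\rho,r} \} > \frac{1}{\tau}. \tag{30} \]

It follows that for any $M \in \mathbb{C}^{n_1 \times n_2}$ such that $\|\mathcal{A}(M)\|_{\ell_2} \le \frac{\|M\|_2}{\tau}$ it holds 

\[ \|M_r\|_2 \le \frac{\rho}{\sqrt{r}} \|M_c\|_1. \tag{31} \]

For the remaining $M \in \mathbb{C}^{n_1 \times n_2}$ with $\|\mathcal{A}(M)\|_{\ell_2} > \frac{\|M\|_2}{\tau}$ we have 

\[ \|M_r\|_2 \le \|M\|_2 < \tau \|\mathcal{A}(M)\|_{\ell_2}. \]

Together with (31) this leads to 

\[ \|M_r\|_2 \le \frac{\rho}{\sqrt{r}} \|M_c\|_1 + \tau \|\mathcal{A}(M)\|_{\ell_2} \]

for any $M \in \mathbb{C}^{n_1 \times n_2}$. □ 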


It is natural to expect that the recovery error gets smaller as the number of measurements increases. This 
can be taken into account by establishing the null space property for $\tau = \frac{\kappa}{\sqrt{m}}$. Then the error bound reads 
as follows 

\[ \|X - X^\#\|_2 \le \frac{2(1+\rho)^2}{(1-\rho)\sqrt{r}} \|X_c\|_1 + \frac{2\kappa(3+\rho)}{\sqrt{m}\,(1-\rho)}\, \eta. \]

An important property of the set $T_{\rho,r}$ is that it is embedded in a set with a simple structure. The next 
lemma relies on the ideas presented in [59] for the compressed sensing setting. 


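Lemma 17. Let D be the set defined by 

\[ D := \operatorname{conv} \{ M \in \mathbb{C}^{n_1 \times n_2} : \|M\|_2 = 1,\ \operatorname{rank} M \le r \}, \tag{32} \]

where conv stands for the convex hull. 

(a) Then D is the unit ball with respect to the norm 

\[ \|M\|_D := \sum_{j=1}^{L} \Big[ \sum_{i \in I_j} \sigma_i(M)^2 \Big]^{1/2}, \]

where $L = \lceil n/r \rceil$ and 

\[ I_j = \begin{cases} \{ r(j-1)+1, \ldots, rj \}, & j = 1, \ldots, L-1, \\ \{ r(L-1)+1, \ldots, n \}, & j = L. \end{cases} \]

(b) It holds 

\[ T_{\rho,r} \subset \sqrt{1 + (1 + \rho^{-1})^2}\; D. \tag{33} \]

Let us argue briefly why $\|\cdot\|_D$ is a norm. Define $g : \mathbb{C}^n \to [0, \infty)$ by 

\[ g(x) := \sum_{j=1}^{L} \Big[ \sum_{i \in I_j} (x_i^*)^2 \Big]^{1/2}, \]

where L and $I_j$ are defined in the same way as in item (a) of Lemma 17 and $x^*$ denotes the non-increasing 
rearrangement of $(|x_1|, \ldots, |x_n|)$. Then g is a symmetric gauge 
function and $\|M\|_D = g(\sigma(M))$ for any $M \in \mathbb{C}^{n_1 \times n_2}$. The norm property follows from [33, Theorem 7.4.7.2]. 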
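To make the block structure of the norm $\|\cdot\|_D$ tangible, here is a minimal Python sketch that evaluates it from a list of singular values: sort them non-increasingly, cut them into consecutive blocks of length r (the last block may be shorter) and add up the Euclidean norms of the blocks. The function name `d_norm` is an illustrative assumption.

```python
import math


def d_norm(sigma, r):
    """Evaluate ||M||_D from the singular values of M.

    sigma: singular values in any order; r: block length (r >= 1).
    """
    ordered = sorted((abs(s) for s in sigma), reverse=True)
    return sum(
        math.sqrt(sum(s * s for s in ordered[start:start + r]))
        for start in range(0, len(ordered), r)
    )


# A matrix of rank at most r has a single non-trivial block, so its
# D-norm coincides with its Frobenius norm.
assert d_norm([3.0, 4.0, 0.0, 0.0], 2) == 5.0

# In general the D-norm lies between the Frobenius and the nuclear norm.
sigma = [5.0, 1.0, 1.0, 1.0]
frobenius = math.sqrt(sum(s * s for s in sigma))
nuclear = sum(sigma)
assert frobenius <= d_norm(sigma, 2) <= nuclear
```

For matrices of rank at most r and unit Frobenius norm the value is 1, which is consistent with D being the unit ball of this norm.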

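Proof of Lemma 17. (a) Any $M \in D$ can be written as 

\[ M = \sum_i \alpha_i X_i \]

with 

\[ \operatorname{rank} X_i \le r, \quad \|X_i\|_2 = 1, \quad \alpha_i \ge 0, \quad \sum_i \alpha_i = 1. \]

Thus 

\[ \|M\|_D \le \sum_i \alpha_i \|X_i\|_D = \sum_i \alpha_i \|X_i\|_2 = \sum_i \alpha_i = 1. \]

Conversely, suppose that $\|M\|_D \le 1$, and let M have a singular value decomposition 
$M = U \Sigma V^* = \sum_{j=1}^{L} \sum_{i \in I_j} \sigma_i(M) u_i v_i^*$, where $u_i \in \mathbb{C}^{n_1}$ and $v_i \in \mathbb{C}^{n_2}$ are column vectors of U and V respectively. Set 
$M_j := \sum_{i \in I_j} \sigma_i(M) u_i v_i^*$ and $\alpha_j := \|M_j\|_2$, $j = 1, \ldots, L$. Then each $M_j$ is a sum of at most r rank-one matrices, so that 
$\operatorname{rank} M_j \le r$, and we can write M as 

\[ M = \sum_{j : \alpha_j \ne 0} \alpha_j \Big( \frac{1}{\alpha_j} M_j \Big) \]

with 

\[ \sum_j \alpha_j = \sum_j \|M_j\|_2 = \|M\|_D \le 1 \quad \text{and} \quad \Big\| \frac{1}{\alpha_j} M_j \Big\|_2 = \frac{1}{\alpha_j} \|M_j\|_2 = 1. \]

Hence $M \in D$. 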

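(b) To prove the embedding of $T_{\rho,r}$ into a scaled version of D, we estimate the norm of an arbitrary 
element M of $T_{\rho,r}$. According to the definition of the $\|\cdot\|_D$-norm 

\[ \|M\|_D = \sum_{\ell=1}^{L} \Big[ \sum_{i \in I_\ell} \sigma_i(M)^2 \Big]^{1/2} = \|M_r\|_2 + \Big[ \sum_{i \in I_2} \sigma_i(M)^2 \Big]^{1/2} + \sum_{\ell \ge 3} \Big[ \sum_{i \in I_\ell} \sigma_i(M)^2 \Big]^{1/2}. \tag{34} \]

To bound the last term in (34), we first note that for each $i \in I_\ell$, $\ell \ge 3$, 

\[ \sigma_i(M) \le \frac{1}{r} \sum_{j \in I_{\ell-1}} \sigma_j(M) \]

and hence 

\[ \Big[ \sum_{i \in I_\ell} \sigma_i(M)^2 \Big]^{1/2} \le \frac{1}{\sqrt{r}} \sum_{j \in I_{\ell-1}} \sigma_j(M). \]

Summing up over $\ell \ge 3$ yields 

\[ \sum_{\ell \ge 3} \Big[ \sum_{i \in I_\ell} \sigma_i(M)^2 \Big]^{1/2} \le \frac{1}{\sqrt{r}} \sum_{\ell \ge 2} \sum_{j \in I_\ell} \sigma_j(M) = \frac{1}{\sqrt{r}} \sum_{j=r+1}^{n} \sigma_j(M), \]

and taking into account the inequality for the singular values of $M \in T_{\rho,r}$ 

\[ \sum_{\ell \ge 3} \Big[ \sum_{i \in I_\ell} \sigma_i(M)^2 \Big]^{1/2} \le \rho^{-1} \|M_r\|_2. \]

Applying the last estimate to (34) we derive that 

\[ \|M\|_D \le (1 + \rho^{-1}) \|M_r\|_2 + \Big[ \sum_{i \in I_2} \sigma_i(M)^2 \Big]^{1/2} \le (1 + \rho^{-1}) \|M_r\|_2 + \big( 1 - \|M_r\|_2^2 \big)^{1/2}. \]

Set $a = \|M_r\|_2$. The maximum of the function 

\[ f(a) := (1 + \rho^{-1}) a + \sqrt{1 - a^2}, \qquad 0 \le a \le 1, \]

is attained at the point 

\[ a = \frac{1 + \rho^{-1}}{\sqrt{1 + (1 + \rho^{-1})^2}} \]

and is equal to $\sqrt{1 + (1 + \rho^{-1})^2}$. Thus for any $M \in T_{\rho,r}$ it holds 

\[ \|M\|_D \le \sqrt{1 + (1 + \rho^{-1})^2}, \]

which proves (33). □ 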
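The one-dimensional maximization at the end of the proof is easy to check numerically. The following Python sketch, with hypothetical helper names, evaluates $f(a) = (1 + \rho^{-1}) a + \sqrt{1 - a^2}$ at the claimed maximizer and compares the value with a grid search over $[0, 1]$.

```python
import math


def f(a, rho):
    """The function maximized at the end of the proof of Lemma 17 (b)."""
    return (1.0 + 1.0 / rho) * a + math.sqrt(1.0 - a * a)


def maximizer(rho):
    """Claimed maximizer a = c / sqrt(1 + c^2) with c = 1 + 1/rho."""
    c = 1.0 + 1.0 / rho
    return c / math.sqrt(1.0 + c * c)


def embedding_constant(rho):
    """Claimed maximum sqrt(1 + (1 + 1/rho)^2), the scaling factor in (33)."""
    c = 1.0 + 1.0 / rho
    return math.sqrt(1.0 + c * c)


for rho in (0.1, 0.5, 0.9):
    best = embedding_constant(rho)
    # The claimed maximizer attains the claimed value ...
    assert abs(f(maximizer(rho), rho) - best) < 1e-12
    # ... and no grid point in [0, 1] exceeds it.
    assert max(f(k / 1000.0, rho) for k in range(1001)) <= best + 1e-12
```

The identity behind the check is the Cauchy-Schwarz inequality applied to the vectors $(c, 1)$ and $(a, \sqrt{1 - a^2})$ with $c = 1 + \rho^{-1}$.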
Remark 18. The previous results hold true in the real-valued case and in the case of Hermitian matrices, 
when the nuclear norm minimization problem is solved over the set of matrices of that special type. As a 
set D we then take the convex hull of corresponding matrices of rank r and unit Frobenius norm. The only 
difference in the proof of Lemma 17 occurs at the point where we have to show that any M with $\|M\|_D \le 1$ 
belongs to D. Say, $M \in \mathbb{C}^{n \times n}$ is Hermitian and $\|M\|_D \le 1$. Then $M = U \Lambda U^* = \sum_{j=1}^{L} \sum_{i \in I_j} \lambda_i(M) u_i u_i^*$, where 
$u_i \in \mathbb{C}^n$ and the eigenvalues are ordered such that $|\lambda_i(M)| = \sigma_i(M)$, and $M_j := \sum_{i \in I_j} \lambda_i(M) u_i u_i^*$ is Hermitian. The rest of the proof remains unchanged. 

Employing the matrix representation of the measurement map $\mathcal{A}$, the problem of estimating the 
probability of the event (30) is reduced to the problem of giving a lower bound for quantities of the form 
$\inf_{x \in T} \|Ax\|_2$. This is not an easy task for deterministic matrices, but the situation changes significantly for 
matrices chosen at random. 


4. Gaussian measurements 

Our main result for Gaussian measurements reads as follows. 


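Theorem 19. Let $\mathcal{A} : \mathbb{C}^{n_1 \times n_2} \to \mathbb{C}^m$ be the linear map (1) generated by a sequence $A_1, \ldots, A_m$ of independent 
standard Gaussian matrices, let $0 < \rho < 1$, $\kappa > 1$ and $0 < \varepsilon < 1$. If 

\[ \frac{m^2}{m+1} \ge \frac{r \big( 1 + (1 + \rho^{-1})^2 \big) \kappa^2}{(\kappa - 1)^2} \left[ \sqrt{n_1} + \sqrt{n_2} + \sqrt{\frac{2 \ln(\varepsilon^{-1})}{r \big( 1 + (1 + \rho^{-1})^2 \big)}} \right]^2, \tag{35} \]

then with probability at least $1 - \varepsilon$, for every $X \in \mathbb{C}^{n_1 \times n_2}$, a solution $X^\#$ of (4) with $b = \mathcal{A}(X) + w$, $\|w\|_{\ell_2} \le \eta$, 
approximates X with error 

\[ \|X - X^\#\|_2 \le \frac{2(1+\rho)^2}{(1-\rho)\sqrt{r}} \|X_c\|_1 + \frac{2\kappa\sqrt{2}\,(3+\rho)}{\sqrt{m}\,(1-\rho)}\, \eta. \]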

In order to prove Theorem 19 we employ Gordon's escape through a mesh theorem that provides an 
estimate of the probability of the event (30). First we recall some definitions. Let $g \in \mathbb{R}^m$ be a standard 
Gaussian random vector, that is, a vector of independent mean zero, variance one normal distributed random 
variables. Then for 

\[ E_m := \mathbb{E} \|g\|_2 = \sqrt{2}\, \frac{\Gamma((m+1)/2)}{\Gamma(m/2)} \]

we have 

\[ \frac{m}{\sqrt{m+1}} \le E_m \le \sqrt{m}, \]

see [27, 26]. For a set $T \subset \mathbb{R}^n$ we define its Gaussian width by 

\[ \ell(T) := \mathbb{E} \sup_{x \in T} \langle x, g \rangle, \]

where $g \in \mathbb{R}^n$ is a standard Gaussian random vector. 
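The two-sided bound on $E_m$ can be checked numerically. The sketch below evaluates $E_m = \sqrt{2}\, \Gamma((m+1)/2) / \Gamma(m/2)$ through log-gamma values, which avoids overflow of the gamma function for large m; the function name `gaussian_norm_mean` is an illustrative choice.

```python
import math


def gaussian_norm_mean(m):
    """E_m = E||g||_2 for a standard Gaussian vector g in R^m."""
    log_ratio = math.lgamma((m + 1) / 2.0) - math.lgamma(m / 2.0)
    return math.sqrt(2.0) * math.exp(log_ratio)


for m in (1, 2, 10, 1000):
    e_m = gaussian_norm_mean(m)
    # Lower and upper bound stated in the text.
    assert m / math.sqrt(m + 1) <= e_m <= math.sqrt(m)

# For m = 1 the value is E|g| = sqrt(2 / pi).
assert abs(gaussian_norm_mean(1) - math.sqrt(2.0 / math.pi)) < 1e-12
```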


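Theorem 20 (Gordon's escape through a mesh [27]). Let $A \in \mathbb{R}^{m \times n}$ be a Gaussian random matrix and T 
be a subset of the unit sphere $S^{n-1} = \{ x \in \mathbb{R}^n : \|x\|_2 = 1 \}$. Then, for t > 0, 

\[ \mathbb{P} \Big( \inf_{x \in T} \|Ax\|_2 > E_m - \ell(T) - t \Big) \ge 1 - e^{-t^2/2}. \tag{36} \]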


In order to apply this result to our measurement process (1) we unravel the columns of $A_j$, $j = 1, \ldots, m$, 
into a single row and collect all of these in an $m \times n_1 n_2$ matrix A, so that $n = n_1 n_2$ when applying (36). 
In order to give a bound on the number of Gaussian measurements, Theorem 20 requires estimating the 
Gaussian width of the set $T_{\rho,r}$ from above. As pointed out in the previous section, $T_{\rho,r}$ is a subset of 
a scaled version of D, which has a relatively simple structure. So instead of evaluating $\ell(T_{\rho,r})$, we consider $\ell(D)$. 
Lemma 21. For the set D defined by (32) it holds 

\[ \ell(D) \le \sqrt{r}\, (\sqrt{n_1} + \sqrt{n_2}). \tag{37} \]

Proof. Let $\Gamma$ be an $n_1 \times n_2$ matrix with independent standard normal distributed entries. Then $\ell(D) = \mathbb{E} \sup_{M \in D} \langle \Gamma, M \rangle$. 
Since a convex continuous real-valued function attains its maximum value at one of the extreme points, it 
holds 

\[ \ell(D) = \mathbb{E} \sup_{\substack{\|M\|_2 = 1 \\ \operatorname{rank} M \le r}} \langle \Gamma, M \rangle. \]

By Hölder's inequality, 

\[ \ell(D) \le \mathbb{E} \sup_{\substack{\|M\|_2 = 1 \\ \operatorname{rank} M \le r}} \|\Gamma\|_\infty \|M\|_1 \le \sqrt{r} \sup_{\substack{\|M\|_2 = 1 \\ \operatorname{rank} M \le r}} \|M\|_2\; \mathbb{E}\, \sigma_1(\Gamma) \le \sqrt{r}\, (\sqrt{n_1} + \sqrt{n_2}), \]

where the last inequality follows from an estimate for the expectation of the largest singular value of a 
Gaussian matrix, see [26, Chapter 9.3]. □ 


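Proof of Theorem 19. Set $t := \sqrt{2 \ln(\varepsilon^{-1})}$. If m satisfies (35), then 

\[ E_m \Big( 1 - \frac{1}{\kappa} \Big) \ge \sqrt{r \big( 1 + (1 + \rho^{-1})^2 \big)}\, (\sqrt{n_1} + \sqrt{n_2}) + t. \]

Together with (33) and (37) this yields 

\[ E_m - \ell(T_{\rho,r}) - t \ge \frac{E_m}{\kappa} \ge \frac{1}{\kappa} \sqrt{\frac{m}{2}}, \]

where the last step uses $E_m \ge m / \sqrt{m+1} \ge \sqrt{m/2}$. According to Theorem 20 

\[ \mathbb{P} \Big( \inf_{M \in T_{\rho,r}} \|\mathcal{A}(M)\|_{\ell_2} > \frac{\sqrt{m}}{\kappa \sqrt{2}} \Big) \ge 1 - \varepsilon, \]

which means that with probability at least $1 - \varepsilon$ the map $\mathcal{A}$ satisfies the Frobenius robust rank null space 
property with constants $\rho$ and $\tau = \frac{\kappa \sqrt{2}}{\sqrt{m}}$. The error estimate follows from Theorem 11. □ 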

5. Measurement matrices with independent entries and four finite moments 

In this section we prove Theorem 1, which is the generalization of Theorem 19 to the case when the 
map $\mathcal{A} : \mathbb{R}^{n_1 \times n_2} \to \mathbb{R}^m$ is obtained from m independent samples of a random matrix $\Phi = (X_{ij})_{i,j}$ with the 
following properties: 

• The $X_{ij}$ are independent random variables of mean zero, 

• $\mathbb{E} X_{ij}^2 = 1$ and $\mathbb{E} X_{ij}^4 \le C_4$ for all i, j and some constant $C_4$. 

Note that (by Hölder's inequality) $C_4 \ge 1$. 

As before the idea of the proof is to show that the event (30) holds with high probability. In order to do 
so we apply Mendelson's small ball method [40, 50, 66] in the manner of [66]. 


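Theorem 22 ([40, 50, 66]). Fix $E \subset \mathbb{R}^d$ and let $\phi_1, \ldots, \phi_m$ be independent copies of a random vector $\phi$ in $\mathbb{R}^d$. 
For $\xi > 0$ let 

\[ Q_\xi(E; \phi) = \inf_{u \in E} \mathbb{P} \{ |\langle \phi, u \rangle| \ge \xi \} \]

and 

\[ W_m(E; \phi) = \mathbb{E} \sup_{u \in E} \langle h, u \rangle, \]

where $h = \frac{1}{\sqrt{m}} \sum_{j=1}^m \varepsilon_j \phi_j$ with $(\varepsilon_j)$ being a Rademacher sequence (that is, the $\varepsilon_j$ are independent and assume 
the values 1 and $-1$ with probability 1/2 each). Then for any $\xi > 0$ and any $t \ge 0$, with 
probability at least $1 - e^{-2t^2}$, 

\[ \inf_{u \in E} \Big( \sum_{i=1}^m |\langle \phi_i, u \rangle|^2 \Big)^{1/2} \ge \xi \sqrt{m}\, Q_{2\xi}(E; \phi) - 2 W_m(E; \phi) - \xi t. \]

We start with two lemmas. 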
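Lemma 23. 

\[ \inf_{\{Y :\, \|Y\|_2 = 1\}} \mathbb{P} \Big( |\langle \Phi, Y \rangle| \ge \frac{1}{\sqrt{2}} \Big) \ge \frac{1}{4 C_5}, \]

where $C_5 = \max\{3, C_4\}$. 

Proof. Assume that Y has Frobenius norm one. The Paley-Zygmund inequality (see e.g. [26, Lemma 7.16], 
and also [66]), implies 

\[ \mathbb{P} \Big( |\langle \Phi, Y \rangle|^2 \ge \frac{1}{2} \mathbb{E} |\langle \Phi, Y \rangle|^2 \Big) \ge \frac{1}{4} \cdot \frac{\big( \mathbb{E} |\langle \Phi, Y \rangle|^2 \big)^2}{\mathbb{E} |\langle \Phi, Y \rangle|^4}. \tag{38} \]

We compute numerator and denominator. 

\[ \mathbb{E} |\langle \Phi, Y \rangle|^2 = \sum_{i,j,k,l} \mathbb{E}(X_{ij} X_{kl})\, Y_{ij} Y_{kl} = \sum_{i,j} Y_{ij}^2 = \|Y\|_2^2 = 1. \]

Likewise, 

\begin{align*}
\mathbb{E} |\langle \Phi, Y \rangle|^4 &= \sum_{i_1, \ldots, i_4, j_1, \ldots, j_4} \mathbb{E}(X_{i_1 j_1} \cdots X_{i_4 j_4})\, Y_{i_1 j_1} \cdots Y_{i_4 j_4} \\
&= \sum_{i,j} \mathbb{E} X_{ij}^4\, Y_{ij}^4 + 3 \sum_{(i_1, j_1) \ne (i_2, j_2)} \mathbb{E} X_{i_1 j_1}^2\, \mathbb{E} X_{i_2 j_2}^2\, Y_{i_1 j_1}^2 Y_{i_2 j_2}^2 \\
&\le C_5 \sum_{i_1, i_2, j_1, j_2} Y_{i_1 j_1}^2 Y_{i_2 j_2}^2 = C_5 \Big( \sum_{i,j} Y_{ij}^2 \Big)^2 = C_5.
\end{align*}

Combining this with $(\mathbb{E} |\langle \Phi, Y \rangle|^2)^2 = 1$ and the estimate (38), the claim follows. □ 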
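For a concrete instance of Lemma 23, take entries that are independent Rademacher signs. Then $\mathbb{E} X_{ij}^2 = \mathbb{E} X_{ij}^4 = 1$, so one may choose $C_4 = 1$ and $C_5 = 3$, and the lemma predicts a small ball probability of at least $1/12$. The following Python sketch computes the probability exactly by enumerating all sign patterns for a few small test matrices Y, given as flat coefficient lists of Frobenius norm one. The Rademacher model and the function name are assumptions made for this illustration only.

```python
import itertools
import math


def small_ball_probability(coeffs, threshold):
    """Exact P(|<Phi, Y>| >= threshold) for i.i.d. Rademacher entries.

    coeffs: the entries of Y as a flat list (only feasible for few entries,
    since all 2^len(coeffs) sign patterns are enumerated).
    """
    hits = 0
    total = 0
    for signs in itertools.product((-1.0, 1.0), repeat=len(coeffs)):
        total += 1
        inner = sum(s * y for s, y in zip(signs, coeffs))
        if abs(inner) >= threshold:
            hits += 1
    return hits / total


threshold = 1.0 / math.sqrt(2.0)
lower_bound = 1.0 / (4.0 * 3.0)  # 1 / (4 * C_5) with C_5 = max{3, C_4} = 3

test_matrices = [
    [0.5, 0.5, 0.5, 0.5],                          # flat 2 x 2 matrix
    [1.0, 0.0, 0.0, 0.0],                          # single non-zero entry
    [threshold, threshold, 0.0, 0.0],              # two equal entries
]
for coeffs in test_matrices:
    assert small_ball_probability(coeffs, threshold) >= lower_bound
```

In these examples the exact probabilities are considerably larger than $1/12$, which reflects that the Paley-Zygmund argument is not tight.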


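Lemma 24. Let $\Phi_1, \ldots, \Phi_m$ be independent copies of a random matrix $\Phi$ as above. Let $\varepsilon_1, \ldots, \varepsilon_m$ be 
independent Rademacher variables independent of everything else and let $H = \frac{1}{\sqrt{m}} \sum_{k=1}^m \varepsilon_k \Phi_k$. Then 

\[ \mathbb{E} \|H\|_\infty \le C_1 \sqrt{n}. \]

Here $C_1$ is a constant that only depends on $C_4$. 

Proof. Let $S = \sum_{k=1}^m \Phi_k$. We first desymmetrize the sum H (see [45, Lemma 6.3]) and obtain 

\[ \mathbb{E} \|H\|_\infty \le \frac{2}{\sqrt{m}}\, \mathbb{E} \|S\|_\infty. \]

Therefore, it is enough to show that $\mathbb{E} \|S\|_\infty \le C_3 \sqrt{mn}$ for a suitable constant $C_3$. The matrix S has 
independent mean zero entries, hence by a result of Latała (see [44]) the following estimate holds for some 
universal constant $C_2$, 

\[ \mathbb{E} \|S\|_\infty \le C_2 \Big( \max_i \sqrt{\sum_j \mathbb{E} S_{ij}^2} + \max_j \sqrt{\sum_i \mathbb{E} S_{ij}^2} + \sqrt[4]{\sum_{i,j} \mathbb{E} S_{ij}^4} \Big). \]

Denoting the entries of $\Phi_k$ by $X_{k;ij}$, we have $S_{ij} = \sum_k X_{k;ij}$. Hence, using the independence of the $X_{k;ij}$, we 
obtain $\mathbb{E} S_{ij}^2 = \mathbb{E} (\sum_k X_{k;ij})^2 = \sum_k \mathbb{E} X_{k;ij}^2 = m$. Thus, $\sqrt{\sum_j \mathbb{E} S_{ij}^2} \le \sqrt{nm}$ for any i and $\sqrt{\sum_i \mathbb{E} S_{ij}^2} \le \sqrt{nm}$ 
for any j. Finally, to estimate $\mathbb{E} S_{ij}^4$ we calculate $\mathbb{E} S_{ij}^4 = \mathbb{E} (\sum_k X_{k;ij})^4$. Using again that the $X_{k;ij}$ 
are independent and have mean zero we obtain 

\[ \mathbb{E} S_{ij}^4 = \sum_k \mathbb{E} X_{k;ij}^4 + 3 \sum_{k_1 \ne k_2} \mathbb{E} X_{k_1;ij}^2\, \mathbb{E} X_{k_2;ij}^2. \]

Using that $\mathbb{E} X_{k;ij}^2 = 1$ for all i, j, k, we obtain $\mathbb{E} S_{ij}^4 \le C_5 m^2$, where $C_5 = \max\{3, C_4\}$, and hence 

\[ \sqrt[4]{\sum_{i,j} \mathbb{E} S_{ij}^4} \le \sqrt[4]{C_5 m^2 n^2} = \sqrt[4]{C_5}\, \sqrt{nm}. \]

Hence, indeed $\mathbb{E} \|S\|_\infty \le C_3 \sqrt{mn}$ for a suitable constant $C_3$ that depends only on $C_4$. □ 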


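Proof of Theorem 1. Let now $T_{\rho,r}$ and D be the sets defined in Section 3, but restricted to the real-valued 
matrices. By Hölder's inequality, for any $n_1 \times n_2$ matrix Y of Frobenius norm 1 and rank at most r and 
any $n_1 \times n_2$ matrix H, 

\[ \langle H, Y \rangle \le \|Y\|_1 \|H\|_\infty \le \sqrt{r}\, \|H\|_\infty. \]

Hence 

\[ \sup_{Y \in D} \langle H, Y \rangle \le \sqrt{r}\, \|H\|_\infty. \tag{39} \]

Let $H = \frac{1}{\sqrt{m}} \sum_{j=1}^m \varepsilon_j \Phi_j$ and let $\xi = \frac{1}{2\sqrt{2}}$ and $E = T_{\rho,r}$. Then it follows from Theorem 22 that for any $t \ge 0$, 
with probability at least $1 - e^{-2t^2}$, 

\[ \inf_{Y \in T_{\rho,r}} \Big( \sum_{i=1}^m |\langle \Phi_i, Y \rangle|^2 \Big)^{1/2} \ge \frac{\sqrt{m}}{2\sqrt{2}}\, Q_{1/\sqrt{2}}(T_{\rho,r}; \Phi) - 2 W_m(T_{\rho,r}, \Phi) - \frac{t}{2\sqrt{2}}. \tag{40} \]

Using Lemma 23 and the fact that all elements of $T_{\rho,r}$ have Frobenius norm 1, we obtain 

\[ Q_{1/\sqrt{2}}(T_{\rho,r}; \Phi) \ge \frac{1}{4 C_5}. \tag{41} \]

Combining now the fact that $T_{\rho,r} \subset \sqrt{1 + (1 + \rho^{-1})^2}\, D$ (see Lemma 17) with estimate (39) and Lemma 24 
leads to 

\[ W_m(T_{\rho,r}, \Phi) \le \sqrt{1 + (1 + \rho^{-1})^2}\, \sqrt{r}\; \mathbb{E} \|H\|_\infty \le C_1 \sqrt{1 + (1 + \rho^{-1})^2}\, \sqrt{r} \sqrt{n}. \tag{42} \]

Using (40), (41) and (42) we see that choosing $m \ge c_1 \rho^{-2} n r$ and $t = c_4 \sqrt{m}$ for suitable constants $c_1, c_4$, we 
obtain with probability at least $1 - e^{-c_2 m}$ 

\[ \inf_{Y \in T_{\rho,r}} \Big( \sum_{i=1}^m |\langle \Phi_i, Y \rangle|^2 \Big)^{1/2} \ge c_3 \sqrt{m} \]

for suitable constants $c_2, c_3$. Now the claim follows from Lemma 16 and Theorem 11 (both of which also 
hold in the real-valued version by the same proofs). □ 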


6. Rank one Gaussian measurements 


In this section we prove Theorem 2. The proof technique is an application of Mendelson’s small ball 
method analogous to the proof of Theorem 1. Let 




M G 1K„ : IIMII2 


1, ||M,||2 > 

yr 



Let $T_{\rho,r}$ be defined as the set above but with $\mathcal{H}_n$ replaced by the set of all complex $n \times n$ matrices (i.e. it is defined 
as before with $n_1 = n_2 = n$). Then the set above is contained in $T_{\rho,r}$. It is enough to show that with high probability 


inf 

YeTj^p 


1/2 


|(a,a*,r)| = 


> TmjC'i 


(43) 


We apply Theorem 22 with E — . The next lemma estimates the small ball probability Q^iE^cf) used 

C2 

in Mendelson’s method. 


Lemma 25 (see [43]). Q^{E](j)) := inf„g£: P{|(aa*, u)| > 











MARYIA KABANAVA\ RICHARD KUENG^’^'^, HOLGER RAUHUT\ ULRICH TERSTIEGE^ 


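Let now (as in [66, 43]) 

\[ H = \frac{1}{\sqrt{m}} \sum_{j=1}^m \varepsilon_j a_j a_j^*, \tag{44} \]

where the $\varepsilon_j$ form a Rademacher sequence. For any $M \in \mathcal{H}_n$ and any $n \times n$ matrix Y of Frobenius norm 1 
and rank at most r 

\[ \langle M, Y \rangle \le \|Y\|_1 \|M\|_\infty \le \sqrt{r}\, \|M\|_\infty. \]

Since $E \subset T_{\rho,r} \subset \sqrt{1 + (1 + \rho^{-1})^2}\, D$, this implies 

\[ W_m(E, \phi) = \mathbb{E} \sup_{Y \in E} \langle H, Y \rangle \le \sqrt{1 + (1 + \rho^{-1})^2}\, \sqrt{r}\; \mathbb{E} \|H\|_\infty. \]

As in [43] we use now that by the arguments in [67, Section 5.4.1] we have $\mathbb{E} \|H\|_\infty \le c_2 \sqrt{n}$ if $m \ge c_3 n$ for 
suitable constants $c_2, c_3$, see also [66, Section 8]. Now the claim of Theorem 2 follows from Theorem 22, 
compare the proof of Theorem 1. □ 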

Remark 26. Inspecting the above proof, respectively the proofs of the cited statements in [43], we see that the 
real-valued analogue of Theorem 2 is also true. For this we may even assume that the $a_j$ are i.i.d. subgaussian 
with k-th moments, where $k \le 8$, equal to the corresponding k-th moments of the standard Gaussian 
distribution. The constants then depend only on the distribution of the $a_j$. We also note that a similar 
statement in the real case for the recovery of positive semidefinite matrices using subgaussian measurements 
has been shown by Chen, Chi and Goldsmith in [19] using the rank restricted isometry property. 


7. Rank one measurements generated by 4-designs 


Recall the definition of an approximate, weighted t-design. 

Definition 27 (Approximate t-design, Definition 2 in [3]). We call a weighted set $\{p_i, w_i\}_{i=1}^N$ of normalized 
vectors an approximate t-design of p-norm accuracy $\theta_p$, if 


Pi {WiWi ) 


i=l 



dw 



(45) 


A set of unit vectors obeying $\theta_p = 0$ for $1 \le p \le \infty$ is called an exact t-design, see [62] and also [43, 28]. 


Theorem 28. Let $\{p_i, w_i\}_{i=1}^N$ be an approximate 4-design with either $\theta_\infty \le 1/(16 r^2)$, or $\theta_1 \le 1/4$ that 
furthermore obeys $\big\| \sum_{i=1}^N p_i w_i w_i^* - \frac{1}{n} \mathrm{id} \big\|_\infty \le \frac{1}{n}$. Suppose that the measurement operator $\mathcal{A}$ is generated by 

\[ m \ge C_4\, \rho^{-2}\, n r \log n \]

measurement matrices $A_j = \sqrt{n(n+1)}\, a_j a_j^*$, where each $a_j$ is drawn independently from $\{p_i, w_i\}_{i=1}^N$. Then, 
with probability at least $1 - e^{-C_5 m}$, $\mathcal{A}$ obeys the Frobenius robust rank null space property of order r with 
constants $0 < \rho < 1$ and $\tau = C_6 / \sqrt{m}$. Here, $C_4$, $C_5$ and $C_6$ denote positive constants depending only on the 
design. 


Theorem 3 readily follows from combining this statement with Theorem 12. 


Proof of Theorem 28. We start by presenting a proof for measurements drawn from an exact 4-design. 
Paralleling the proof of Theorem 2, the statement can be deduced from Theorem 22 by utilizing results from 
[43]. Provided that a is randomly chosen from a re-scaled, weighted 4-design (such that each element has 
Euclidean length $\|w_i\|_{\ell_2} = \sqrt[4]{(n+1)n}$), [43, Proposition 12] implies that 

\[ \inf_{Z \in T_{\rho,r}} \mathbb{P} \big( |\operatorname{tr}(a a^* Z)| \ge \xi \big) \ge \inf_{\|Z\|_2 = 1} \mathbb{P} \big( |\operatorname{tr}(a a^* Z)| \ge \xi \big) \ge \frac{(1 - \xi^2)^2}{24} \tag{46} \]

is valid for all $\xi \in [0, 1]$. Now let $H = \frac{1}{\sqrt{m}} \sum_{j=1}^m \varepsilon_j a_j a_j^*$ be as in Theorem 22. Lemma 17 together with the fact 
that D is the convex hull of all matrices of rank at most r and Frobenius norm 1 allows us to conclude for 
$m \ge 2 n \log n$ that 

\begin{align*}
W_m(T_{\rho,r}, a a^*) = \mathbb{E} \sup_{M \in T_{\rho,r}} \operatorname{tr}(H M) &\le \sqrt{1 + (1 + \rho^{-1})^2}\; \mathbb{E} \sup_{M \in D} \operatorname{tr}(H M) \\
&\le \sqrt{1 + (1 + \rho^{-1})^2}\, \sup_{M \in D} \|M\|_1\; \mathbb{E} \|H\|_\infty \le \sqrt{1 + (1 + \rho^{-1})^2}\, \sqrt{r}\; \mathbb{E} \|H\|_\infty \\
&\le 3.1049 \sqrt{\big( 1 + (1 + \rho^{-1})^2 \big)\, r n \log(2n)},
\end{align*}

where the last bound is due to [43, Proposition 13]. Fixing $0 < \xi < 1/2$ arbitrarily and inserting these two 
bounds into Theorem 22 completes the proof. 

An analogous statement for approximate 4-designs, with slightly worse absolute constants, can be 
obtained by resorting to the generalized versions of [43, Propositions 12 and 13] presented in Section 4.5.1 
in loc. cit., which are valid for approximate 4-designs that satisfy the conditions stated in Theorem 28. □ 

8. The positive semidefinite case 

Finally, we focus on the case where the matrices of interest are Hermitian and positive semidefinite and 
establish Theorem 4. In order to arrive at such a statement, we closely follow the ideas presented in [36], 
which in turn were inspired by [9], containing an analogous statement for a non-negative compressed sensing 
scenario. 

We require two further concepts from matrix analysis. For every positive semidefinite matrix $W \succeq 0$ with 
eigenvalue decomposition $W = \sum_{i=1}^n \lambda_i w_i w_i^*$ we define its square root to be $W^{1/2} := \sum_{i=1}^n \sqrt{\lambda_i}\, w_i w_i^*$. In 
other words, $W^{1/2}$ is the unique positive semidefinite matrix which acts on the eigenspace corresponding 
to the eigenvalue $\lambda_i$ of W by multiplication by $\sqrt{\lambda_i}$. Note that this matrix obeys $W^{1/2} \cdot W^{1/2} = W$. Also, 
recall that the condition number $\kappa(W)$ of a matrix W is the ratio between its largest and smallest nonzero 
singular value. For an invertible Hermitian matrix with inverse $W^{-1}$ this number equals 

\[ \kappa(W) = \|W\|_\infty \|W^{-1}\|_\infty. \]

Suppose that the measurement process (3) is such that there exists $t \in \mathbb{R}^m$ which assures that 
$W := \sum_{j=1}^m t_j A_j$ is positive definite. We define the artificial measurement map 

\[ \mathcal{A}_{W^{1/2}} : \mathcal{H}_n \to \mathbb{R}^m, \qquad Z \mapsto \mathcal{A}(W^{-1/2} Z W^{-1/2}) \tag{47} \]

and the endomorphism 

\[ Z \mapsto \tilde{Z} := W^{1/2} Z W^{1/2} \tag{48} \]

of $\mathcal{H}_n$. Note that these definitions assure 

\[ \mathcal{A}(Z) = \mathcal{A}_{W^{1/2}}(\tilde{Z}) \quad \text{for all } Z \in \mathcal{H}_n \tag{49} \]

and the singular values of Z and $\tilde{Z}$ satisfy 

\begin{align*}
\sigma_j(\tilde{Z}) &\le \|W^{1/2}\|_\infty^2\, \sigma_j(Z) = \|W\|_\infty\, \sigma_j(Z), \\
\sigma_j(Z) &\le \|W^{-1/2}\|_\infty^2\, \sigma_j(\tilde{Z}) = \|W^{-1}\|_\infty\, \sigma_j(\tilde{Z}), \tag{50}
\end{align*}

see [7, p. 75]. Consequently, the mapping (48) preserves the rank of any matrix. The following result assures 
that the artificial measurement operator $\mathcal{A}_{W^{1/2}}$ obeys the Frobenius robust rank null space property, if the 
original $\mathcal{A}$ does. 


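Lemma 29. Suppose that $\mathcal{A}$ satisfies the Frobenius robust rank null space property of order r with constants 
$\rho$ and $\tau$ and suppose that $W = \sum_{j=1}^m t_j A_j$ is positive definite. Then $\mathcal{A}_{W^{1/2}}$ also obeys the Frobenius robust 
rank null space property of order r, but with constants $\tilde{\rho} = \kappa(W) \rho$ and $\tilde{\tau} = \|W\|_\infty \tau$. 

Proof. Let $Z \in \mathcal{H}_n$. Relations (49), (50) together with the Frobenius robust rank null space property of $\mathcal{A}$ 
imply that 

\begin{align*}
\|\tilde{Z}_r\|_2 \le \|W\|_\infty \|Z_r\|_2 &\le \|W\|_\infty \Big( \frac{\rho}{\sqrt{r}} \|Z_c\|_1 + \tau \|\mathcal{A}(Z)\|_{\ell_2} \Big) \\
&\le \frac{\kappa(W) \rho}{\sqrt{r}} \|\tilde{Z}_c\|_1 + \|W\|_\infty \tau\, \|\mathcal{A}_{W^{1/2}}(\tilde{Z})\|_{\ell_2}.
\end{align*}

Since every element of $\mathcal{H}_n$ is of the form $\tilde{Z}$ for some $Z \in \mathcal{H}_n$, the claim follows. □ 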


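Lemma 30. Suppose there is $t \in \mathbb{R}^m$ such that $W := \sum_{j=1}^m t_j A_j$ is positive definite. Let $X, Z \in \mathcal{H}_n$ be positive 
semidefinite and let $\tilde{X}, \tilde{Z}$ denote their images under the map (48). Then, 

\[ \|\tilde{Z}\|_1 - \|\tilde{X}\|_1 \le \|t\|_{\ell_2}\, \|\mathcal{A}_{W^{1/2}}(\tilde{Z} - \tilde{X})\|_{\ell_2}. \]

Proof. The claim follows from positive semidefiniteness of both $\tilde{Z}$ and $\tilde{X}$ and our choice of the endomorphism 
(48). Indeed, 

\begin{align*}
\|\tilde{Z}\|_1 &= \operatorname{tr}(\tilde{Z} - \tilde{X}) + \|\tilde{X}\|_1 = \operatorname{tr}\big( W^{1/2} (Z - X) W^{1/2} \big) + \|\tilde{X}\|_1 = \operatorname{tr}\big( W (Z - X) \big) + \|\tilde{X}\|_1 \\
&= \sum_{i=1}^m t_i \operatorname{tr}\big( A_i (Z - X) \big) + \|\tilde{X}\|_1 = \langle t, \mathcal{A}(Z - X) \rangle + \|\tilde{X}\|_1 \\
&= \langle t, \mathcal{A}_{W^{1/2}}(\tilde{Z} - \tilde{X}) \rangle + \|\tilde{X}\|_1 \le \|t\|_{\ell_2}\, \|\mathcal{A}_{W^{1/2}}(\tilde{Z} - \tilde{X})\|_{\ell_2} + \|\tilde{X}\|_1.
\end{align*}

Here X resp. Z denote the preimage of $\tilde{X}$ resp. $\tilde{Z}$ under the map (48). □ 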


This simple technical statement allows us to establish the main result of this section. 


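Theorem 31. Suppose there exists $t \in \mathbb{R}^m$ such that $W := \sum_{j=1}^m t_j A_j$ is positive definite and $\mathcal{A}$ satisfies 
the Frobenius robust rank null space property with constants $0 < \rho < \frac{1}{\kappa(W)}$ and $\tau > 0$. Let $1 \le p \le 2$. Then, 
for any $X, Z \succeq 0$, 

\[ \|Z - X\|_p \le \frac{2 C \kappa(W)}{r^{1-1/p}} \|X_c\|_1 + r^{1/p-1/2} \Big( D \tau \|W\|_\infty + \frac{C \|t\|_{\ell_2}}{\sqrt{r}} \Big) \|W^{-1}\|_\infty\, \|\mathcal{A}(Z) - \mathcal{A}(X)\|_{\ell_2} \tag{51} \]

with constants $C = \frac{(1 + \kappa(W) \rho)^2}{1 - \kappa(W) \rho}$ and $D = \frac{3 + \kappa(W) \rho}{1 - \kappa(W) \rho}$. 

Proof. Let $X, Z \succeq 0$ be arbitrary. Then 

\[ \|Z - X\|_p = \|W^{-1/2} (\tilde{Z} - \tilde{X}) W^{-1/2}\|_p \le \|W^{-1}\|_\infty\, \|\tilde{Z} - \tilde{X}\|_p \]

holds and the resulting matrices $\tilde{Z}, \tilde{X}$ are again positive semidefinite. Also, since $\mathcal{A}$ satisfies the Frobenius 
robust rank null space property with constants $0 < \rho < \frac{1}{\kappa(W)}$ and $\tau > 0$, Lemma 29 assures that $\mathcal{A}_{W^{1/2}}$ does 
the same with constants $0 < \tilde{\rho} = \kappa(W) \rho < 1$ and $\tilde{\tau} = \|W\|_\infty \tau > 0$. Combining this with Theorem 12 and Lemma 30 
implies 

\begin{align*}
\|\tilde{Z} - \tilde{X}\|_p &\le \frac{C}{r^{1-1/p}} \big( \|\tilde{Z}\|_1 - \|\tilde{X}\|_1 + 2 \|\tilde{X}_c\|_1 \big) + D \|W\|_\infty \tau\, r^{1/p-1/2}\, \|\mathcal{A}_{W^{1/2}}(\tilde{Z} - \tilde{X})\|_{\ell_2} \\
&\le \frac{C}{r^{1-1/p}} \big( \|t\|_{\ell_2} \|\mathcal{A}_{W^{1/2}}(\tilde{Z} - \tilde{X})\|_{\ell_2} + 2 \|\tilde{X}_c\|_1 \big) + D \|W\|_\infty \tau\, r^{1/p-1/2}\, \|\mathcal{A}_{W^{1/2}}(\tilde{Z} - \tilde{X})\|_{\ell_2}.
\end{align*}

The desired statement follows from this estimate by taking into account (49) and (50). □ 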


Note that in contrast to other recovery guarantees established here, Theorem 31 does not require any 
convex optimization procedure. However, it does require the measurement process to obey an additional 
criterion: the intersection of the span of measurement matrices with the cone of positive definite matrices 
must be non-empty. We show that this is the case for the rank-one projective measurements introduced 
in the previous section with high probability. Since it has already been established that sufficiently many 
measurements of this kind obey the Frobenius robust rank null space property with high probability (see 
Theorems 2 and 28 and their respective proofs), Theorem 4 can then be established by taking the union 
bound over the individual probabilities of failure. 


Proposition 32. Suppose m > 4n and let Ai ,..., Am be matrices of the form OjO*, where each Oi G C" is 
a random complex standard Gaussian vector. Then with probability at least 1 — 2e“‘^^“'", IT := A Aj 

is positive definite and obeys 

\[ \max \{ \|W\|_\infty, \|W^{-1}\|_\infty, \kappa(W) \} \le C_{11}. \tag{52} \]

Here, $C_9, C_{10}, C_{11} > 0$ denote universal positive constants. 

Note that such a construction corresponds to setting $t = \frac{1}{m} (1, \ldots, 1)^T \in \mathbb{R}^m$, which obeys $\|t\|_{\ell_2} = \frac{1}{\sqrt{m}}$. 


Proof. For the sake of simplicity, we are going to establish the statement for real standard Gaussian vectors. 
Establishing the complex case can be done analogously and leads to slightly different constants. Let 
$e_1, \ldots, e_m$ denote the standard basis in $\mathbb{R}^m$. We define the auxiliary $m \times n$ matrix $A := \sum_{j=1}^m e_j a_j^T$ which 
obeys 

\[ \frac{1}{m} A^T A = \frac{1}{m} \sum_{i,j=1}^m a_i e_i^T e_j a_j^T = \frac{1}{m} \sum_{i=1}^m a_i a_i^T = \frac{1}{m} \sum_{i=1}^m A_i = W. \]

Also, by construction, A is a random matrix with standard Gaussian entries. Essentially, this relation 
implies that mW is Wishart-distributed. From (8) and the defining properties of eigen- and singular values 
we infer that 

\[ \sqrt{\lambda_{\min}(W)} = \frac{1}{\sqrt{m}} \sqrt{\lambda_{\min}(A^T A)} = \frac{\sigma_{\min}(A)}{\sqrt{m}} \tag{53} \]

and an analogous statement is true for the largest eigenvalue $\lambda_{\max}(W)$. Since A is a Gaussian $m \times n$ matrix, 
concentration of measure implies that for any $\tau > 0$ 

\[ \sqrt{m} - \sqrt{n} - \tau \le \sigma_{\min}(A) \le \sigma_{\max}(A) \le \sqrt{m} + \sqrt{n} + \tau \tag{54} \]

with probability at least $1 - 2 e^{-\tau^2/2}$, see e.g. [67, Corollary 5.35] or [26, Theorem 9.26]. Combining this 
with (53), recalling the assumption $m \ge 4n$ and defining $\tau = \tilde{\tau} \sqrt{m}$ allows for establishing 

\[ \frac{1}{2} - \tilde{\tau} \le 1 - \sqrt{\frac{n}{m}} - \tilde{\tau} \le \sqrt{\lambda_{\min}(W)} \le \sqrt{\lambda_{\max}(W)} \le 1 + \sqrt{\frac{n}{m}} + \tilde{\tau} \le \frac{3}{2} + \tilde{\tau} \]

with probability at least $1 - 2 e^{-m \tilde{\tau}^2 / 2}$. This inequality chain remains valid if we square the individual terms. 
Setting $\tilde{\tau} = 1/4$ thus allows us to conclude 

\[ \max \{ \|W\|_\infty, \|W^{-1}\|_\infty, \kappa(W) \} \le \Big( \frac{7}{4} \Big)^2 4^2 = 49 = C_{11}, \tag{55} \]

with probability at least $1 - 2 e^{-m/32}$. □ 


Alternatively, we could have relied on bounds on the condition number of Gaussian random matrices 
presented in [20]. While these bounds would be slightly tighter, we feel that our derivation is more illustrative 
and it suffices for our purpose. 


Proposition 33. Suppose m > C^nrlogn and let Ai,..., Am be matrices of the form OjO*, where eaeh 
Oj £ C" is chosen independently from a weighted set of vectors obeying = yjn{n + 1) for 

all 1 < i < N and 


N 

ypiWiw* 

2=1 




(56) 


Then with probability at least 1 — e , the matrix kF := A Aj is positive definite and obeys 


max{||W||oo, IjkF loo, k{W)} < 8 . 


(57) 


Here, C 4 > 1 and 0 < 7 < 1 denote absolute constants of adequate size. 


Note that condition (56) is slightly stronger than the corresponding condition in Theorem 28. Also, the 
construction of W again uses t = A ^ a) £ R"*. 
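Proof. In order to show this statement, we are going to employ the matrix Bernstein inequality [65, Theorem 
6.1], see also [1], in order to establish 

\[ \Big\| W - \sqrt{\frac{n+1}{n}}\, \mathrm{id} \Big\|_\infty \le \frac{3}{4} \tag{58} \]

with high probability. Let $\lambda_1(W), \ldots, \lambda_n(W)$ denote the eigenvalues of W. Then such a bound together 
with the definition of the operator norm assures 

\begin{align*}
1 - \lambda_{\min}(W) &\le \sqrt{\frac{n+1}{n}} - \lambda_{\min}(W) \le \Big| \sqrt{\frac{n+1}{n}} - \lambda_{\min}(W) \Big| \le \max_{1 \le i \le n} \Big| \sqrt{\frac{n+1}{n}} - \lambda_i(W) \Big| = \Big\| \sqrt{\frac{n+1}{n}}\, \mathrm{id} - W \Big\|_\infty \le 3/4, \\
\lambda_{\max}(W) - \sqrt{\frac{n+1}{n}} &\le \Big| \lambda_{\max}(W) - \sqrt{\frac{n+1}{n}} \Big| \le \max_{1 \le i \le n} \Big| \sqrt{\frac{n+1}{n}} - \lambda_i(W) \Big| = \Big\| W - \sqrt{\frac{n+1}{n}}\, \mathrm{id} \Big\|_\infty \le 3/4.
\end{align*}

This in turn implies $\lambda_{\min}(W) \ge 1/4$ as well as $\lambda_{\max}(W) \le 3/4 + \sqrt{\frac{n+1}{n}} \le 2$ for $n \ge 2$ and the desired bound 
(57) readily follows. 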

It remains to assure the validity of (58) with high probability. To this end, for $1 \le k \le m$, we define the 
random matrices $M_k := \frac{1}{m} \big( a_k a_k^* - \mathbb{E}[a_k a_k^*] \big)$, where each $a_k$ is chosen independently at random from the 
weighted set $\{p_i, w_i\}_{i=1}^N$. This definition assures 


IT- 1 

/n+1 , 


/-id 

= 


V n 

oo 


k=l 


{Mk + E [okal] ) - J nil id 


< 


Y^Mk 


k^l 


via the triangle inequality and assumption (56) and along similar lines 


l|EMIL< 2 


1 n + 1 


< 2 


(59) 


(60) 


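readily follows for any $1 \le k \le m$. The random matrices $M_k$ have mean zero by construction and each of them obeys

$$\|M_k\|_\infty = \frac{1}{m}\left\| a_k a_k^* - \mathbb{E}[a_k a_k^*] \right\|_\infty \le \frac{1}{m}\max\left\{ \|a_k a_k^*\|_\infty,\ \|\mathbb{E}[a_k a_k^*]\|_\infty \right\} = \frac{1}{m}\|a_k\|_{\ell_2}^2 = \frac{\sqrt{(n+1)n}}{m}$$

as well as

$$\left\| \mathbb{E}[M_k^2] \right\|_\infty = \frac{1}{m^2}\left\| \mathbb{E}\left[(a_k a_k^*)^2\right] - \mathbb{E}[a_k a_k^*]^2 \right\|_\infty = \frac{1}{m^2}\left\| \sqrt{(n+1)n}\,\mathbb{E}[a_k a_k^*] - \mathbb{E}[a_k a_k^*]^2 \right\|_\infty \le \frac{1}{m^2}\max\left\{ \sqrt{(n+1)n}\,\|\mathbb{E}[a_k a_k^*]\|_\infty,\ \|\mathbb{E}[a_k a_k^*]\|_\infty^2 \right\} \le \frac{2\sqrt{(n+1)n}}{m^2}.$$

Hence

$$\left\| \sum_{k=1}^m \mathbb{E}[M_k^2] \right\|_\infty \le \sum_{k=1}^m \left\| \mathbb{E}[M_k^2] \right\|_\infty \le \frac{2\sqrt{(n+1)n}}{m}.$$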
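
The two ingredients behind these bounds, the identity $(a_k a_k^*)^2 = \|a_k\|_{\ell_2}^2\, a_k a_k^*$ and the operator norm bound on $M_k$, can be illustrated numerically. The following is a minimal sketch under a simplifying assumption that is not made in the text: $a_k$ is drawn uniformly from the complex sphere of radius $\sqrt[4]{n(n+1)}$ rather than from a finite weighted set, so that $\mathbb{E}[a_k a_k^*] = \sqrt{(n+1)/n}\,\mathrm{id}$ holds exactly. The values `n = 5`, `m = 50` and all variable names are illustrative.

```python
import numpy as np

n, m = 5, 50                              # illustrative sizes
rng = np.random.default_rng(2)
scale = np.sqrt(n * (n + 1))              # equals ||a_k||^2 by construction

# Assumption: a_k uniform on the complex sphere of radius (n(n+1))^(1/4).
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)
a = scale ** 0.5 * g / np.linalg.norm(g)

P = np.outer(a, a.conj())                 # the rank-one matrix a_k a_k^*

# (a a^*)^2 = ||a||^2 a a^*, which is used when computing E[M_k^2].
identity_gap = np.linalg.norm(P @ P - scale * P, 2)

# For the uniform sphere, E[a a^*] = sqrt((n+1)/n) * id.
mean = np.sqrt((n + 1) / n) * np.eye(n)
M = (P - mean) / m
norm_M = np.linalg.norm(M, 2)

print(f"identity gap: {identity_gap:.2e}")
print(f"||M_k|| = {norm_M:.4f}, bound sqrt((n+1)n)/m = {scale / m:.4f}")
```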


* Resorting to the matrix Chernoff inequality would allow for establishing a similar result. However, in the case of an exact tight frame, the numerical constants obtained by doing so are slightly worse.
These bounds allow us to set $R := \frac{\sqrt{(n+1)n}}{m}$ and $\sigma^2 := \frac{2\sqrt{(n+1)n}}{m}$ and apply the matrix Bernstein inequality ([65, Theorem 6.1], [1]) in order to establish

$$\Pr\left[ \left\| \sum_{k=1}^m M_k \right\|_\infty \ge \tau \right] \le n \exp\left( -\frac{\tau^2/2}{\sigma^2 + R\tau/3} \right) \le n \exp\left( -\frac{3\tau^2 m}{16\sqrt{(n+1)n}} \right)$$

for $0 < \tau \le \sigma^2/R = 2$. Setting $\tau = 1/4$ and inserting $m \ge C_4\, n r \log(n)$ (where $C_4$ is large enough) assures that (58) holds with probability of failure smaller than $e^{[\ldots]}$ via (59) for a suitable $\gamma > 0$. □
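
The concentration statement just proved can be observed in a small simulation. The following is a minimal sketch, assuming NumPy and, as a simplification not made in the text, measurement vectors drawn uniformly from the complex sphere of radius $\sqrt[4]{n(n+1)}$ (an exact tight frame in expectation) instead of a finite weighted set. The sizes `n = 16`, `m = 2000`, the seed and the helper name `sample_scaled_sphere` are hypothetical choices.

```python
import numpy as np

def sample_scaled_sphere(m, n, rng):
    """Draw m vectors uniformly from the complex sphere of radius (n(n+1))^(1/4)."""
    g = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
    unit = g / np.linalg.norm(g, axis=1, keepdims=True)
    return (n * (n + 1)) ** 0.25 * unit   # row k holds the entries of a_k

n, m = 16, 2000                           # illustrative sizes
rng = np.random.default_rng(1)
A = sample_scaled_sphere(m, n, rng)

# W = (1/m) * sum_k a_k a_k^*; entry (i, j) is the average of a_k[i] * conj(a_k[j]).
W = A.T @ A.conj() / m

deviation = np.linalg.norm(W - np.sqrt((n + 1) / n) * np.eye(n), 2)
eigenvalues = np.linalg.eigvalsh(W)
condition_number = eigenvalues[-1] / eigenvalues[0]

print(f"||W - sqrt((n+1)/n) id|| = {deviation:.3f}")
print(f"kappa(W) = {condition_number:.3f}")
```

For sizes like these the observed deviation is typically well below the threshold 3/4 in (58), which suggests that the constants in the proof are conservative; the simulation illustrates the statement but does not replace the argument.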


Finally, we are ready to prove Theorem 4. 


Proof of Theorem 4. We content ourselves with establishing the design case and point out that the Gaussian case can be proved analogously (albeit with different constants). Fix $0 < \rho < 1/8$ and suppose that

$$m \ge C_3 \left( 1 + (1 + \rho^{-1})^2 \right) n r \log n$$

measurement vectors have been chosen independently from an approximate 4-design. Theorem 28 then assures that the resulting measurement operator $\mathcal{A}$ obeys the robust Frobenius rank null space property with constants $\rho < 1/8$ and $\tau \le C_6/\sqrt{m}$ with probability at least $1 - e^{-C_5 m}$. Likewise, Proposition 33 assures that with probability at least $1 - e^{[\ldots]}$ setting $t = \frac{1}{m}(1, \ldots, 1)^T \in \mathbb{R}^m$ leads to a positive definite $W = \frac{1}{m}\sum_{j=1}^m a_j a_j^*$ obeying $\kappa(W) \le 8$. Note that such a $t$ obeys $\|t\|_{\ell_2} = 1/\sqrt{m}$ and also $0 < \rho < 1/8 \le 1/\kappa(W)$ holds by construction. The union bound over these two assertions failing implies that the requirements of Theorem 31 are met with probability at least

$$1 - e^{-C_5 m} - e^{[\ldots]} \ge [\ldots],$$

where $\gamma$ denotes a sufficiently small absolute constant and $C_4 = m/(n r \log n)$. The constants $C_4$ and $s$ presented in Theorem 4 then amount to $s = \gamma C_4$ and $C_2 \ge C_4$. Inserting $\|t\|_{\ell_2} = 1/\sqrt{m}$ and the bounds on $\|W\|_\infty$, $\|W^{-1}\|_\infty$, $\kappa(W)$ from Proposition 33 into (51) yields


$$\|Z - X\|_p \le \cdots \le \frac{C_3}{r^{1-1/p}}\,\|X_c\|_1 + \frac{C_4\, r^{1/p-1/2}}{\sqrt{m}}\,\|\mathcal{A}(Z) - \mathcal{A}(X)\|_{\ell_2}$$


with constants $C_3 = 16C$ and $C_4 = 8C + 8DC_6$ (where $C, D$ were introduced in Theorem 31 and $C_6$ is […]). □


Remark 34. In Corollary 6 we focus on recovering density operators, i.e., positive semidefinite matrices $X$ with trace one. This trace constraint can be re-interpreted as an additional perfectly noiseless measurement

$$b_0 = \mathrm{tr}(\mathrm{id}\, X) = \mathrm{tr}(X) = 1$$

corresponding to the measurement matrix $A_0 = \mathrm{id}$. Setting $t = (1, 0, \ldots, 0)^T \in \mathbb{R}^{m+1}$ in Theorem 31 then leads to $W = \mathrm{id}$ which obeys $\|W\|_\infty = \|W^{-1}\|_\infty = \kappa(W) = 1$ and furthermore assures that the endomorphism (48) is trivial, i.e. $\tilde{Z} = Z$ for all $Z \in \mathbb{H}_n$. Moreover, these properties render the estimate provided in Lemma 30 redundant, because any two density operators $X, Z$ obey

$$\|\tilde{Z}\|_1 - \|\tilde{X}\|_1 = \|Z\|_1 - \|X\|_1 = \mathrm{tr}(Z) - \mathrm{tr}(X) = 0.$$

Such a refinement then allows for dropping the term containing $\|t\|_{\ell_2}$ in (51) and by inserting $W = \mathrm{id}$ we arrive at the following conclusion: Any measurement operator $\mathcal{A}$ that obeys the Frobenius robust rank null space property with constants $0 < \rho < 1$ and $\tau > 0$ assures for $1 \le p \le 2$ and any two density operators

$$\|Z - X\|_p \le [\ldots] + r^{1/p-1/2}\left([\ldots] + [\ldots]\right)\|\mathcal{A}(Z) - \mathcal{A}(X)\|_{\ell_2}.$$

Corollary 6 then follows from combining this assertion with Theorem 28 and setting p = 1. 
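
The observation that the nuclear norm of a density operator equals its trace, so that the nuclear norms of any two density operators coincide, can be verified numerically. The following is a minimal sketch, assuming NumPy; the helper `random_density_operator`, the dimension 4 and the seed are hypothetical choices made only for illustration.

```python
import numpy as np

def random_density_operator(n, rng):
    """Return a random n x n positive semidefinite matrix with unit trace."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    psd = g @ g.conj().T                  # positive semidefinite by construction
    return psd / np.trace(psd).real       # normalize to trace one

rng = np.random.default_rng(3)
X = random_density_operator(4, rng)
Z = random_density_operator(4, rng)

# The nuclear norm is the sum of singular values; for PSD matrices it equals the trace.
nuc_X = np.linalg.norm(X, "nuc")
nuc_Z = np.linalg.norm(Z, "nuc")
trace_gap = abs(nuc_Z - nuc_X)

print(f"||X||_1 = {nuc_X:.6f}, ||Z||_1 = {nuc_Z:.6f}, difference = {trace_gap:.2e}")
```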
Appendix

A brief review of finite-dimensional quantum mechanics. For the sake of being self-contained we briefly recapitulate crucial concepts of (finite-dimensional) quantum mechanics without going too much into detail. For further reading on the topics introduced here, we refer the interested reader to [53, Chapter 2.2].

An isolated quantum mechanical system is fully described by its density operator. For a finite $n$-dimensional quantum system, such a density operator corresponds to a Hermitian, positive semidefinite matrix $\rho$ with unit trace.

The most general notion of a measurement is that of a positive operator-valued measure (POVM). For an $n$-dimensional quantum system, a POVM corresponds to a collection $M = \{E_m\}_{m \in I}$ of positive semidefinite $n \times n$ matrices that sum up to identity, i.e.,

$$\sum_{m \in I} E_m = \mathrm{id}.$$

The indices $m \in I$ indicate the possible measurement outcomes of performing such a POVM measurement. Upon performing $M$ on a system described by $\rho$, quantum mechanics then postulates that the probability of obtaining the outcome (labeled by) $m$ corresponds to

$$p(m, \rho) = \mathrm{tr}(E_m \rho).$$

Repeating the same measurement (i.e., preparing $\rho$ and measuring $M$) many times allows one to estimate the probabilities $p(m, \rho)$ ever more accurately.

Note that the definitions of $\rho$ and $M$ assure that $(p(m, \rho))_{m \in I}$ is in fact a valid probability distribution. Indeed, $p(m, \rho) \ge 0$ follows from positive semidefiniteness of both $\rho$ and $E_m$. Unit trace of $\rho$ assures proper normalization via

$$\sum_{m \in I} p(m, \rho) = \sum_{m \in I} \mathrm{tr}(E_m \rho) = \mathrm{tr}(\mathrm{id}\, \rho) = \mathrm{tr}(\rho) = 1.$$
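
The normalization argument above can be reproduced numerically. The following is a minimal sketch, assuming NumPy; the construction of a random POVM by rescaling random positive semidefinite matrices with the inverse square root of their sum is one convenient choice among many and is not taken from the text, and the names `random_psd` and `random_povm` are hypothetical.

```python
import numpy as np

def random_psd(n, rng):
    """Return a random n x n positive semidefinite (almost surely definite) matrix."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return g @ g.conj().T

def random_povm(n, num_outcomes, rng):
    """Return PSD matrices E_m that sum to the identity."""
    raw = [random_psd(n, rng) for _ in range(num_outcomes)]
    total = sum(raw)                                  # positive definite almost surely
    vals, vecs = np.linalg.eigh(total)
    inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.conj().T  # total^(-1/2)
    return [inv_sqrt @ g @ inv_sqrt for g in raw]

n, num_outcomes = 3, 5                                # illustrative sizes
rng = np.random.default_rng(4)
povm = random_povm(n, num_outcomes, rng)

rho = random_psd(n, rng)
rho = rho / np.trace(rho).real                        # density operator: PSD, unit trace

# Outcome probabilities p(m, rho) = tr(E_m rho).
probabilities = np.array([np.trace(E @ rho).real for E in povm])

print("probabilities:", np.round(probabilities, 4))
print("sum:", probabilities.sum())
```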